The existence of parallel computing systems and the important
applications of geometric solutions have motivated our study of the design
and analysis of algorithms for solving geometric problems on two parallel
computing systems: the Shared Memory Machine (SMM) and the Cube-Connected-
Cycles (CCC). The validity of the first, the SMM, resides in uncovering the
inherent data-dependence of the problems, while that of the CCC, which
complies with the VLSI technological constraints, is the development of
practical parallel algorithms. It is shown that solutions to geometric
problems can be organized to reveal a large amount of parallelism, which
can be exploited to substantially reduce the computation time. Precisely,
using the SMM with a number of processors and memory units linear in the
problem size, algorithms are developed to solve problems of reporting
intersection of N rectangles in time O((logN) + k), where k is the
maximum number of intersections per rectangle,
intersection of N rectangles in time O((logN) ), planar point location
in time O((logN)^2 loglogN), finding the two-dimensional convex hull of N
points in time O((logN) ), the three-dimensional convex hull of N
points in time O((logN) loglogN), and constructing the planar Voronoi
diagram of N points in time O((logN) loglogN). Using the CCC with a
number of processors linear in the problem size, the parallel algorithms
developed for all of these problems, except reporting intersection of
rectangles and constructing the two-dimensional convex hull,
have time complexity increased only by a factor of logN/loglogN with
respect to that on the SMM. The algorithms for reporting intersection
of rectangles and for constructing the two-dimensional convex hull on
the CCC have the same time complexity as those on the SMM. With an
increase in the number of processors of the CCC to N^{1+α} (0 < α <= 1),
all of these problems can be solved with algorithms of time complexity
improved by a factor of 1/(α logN) with respect to that on the CCC with N
processors.

CHAPTER 1


INTRODUCTION 

The existence of parallel computers [5,10,15,32,38] has motivated the
development of parallel algorithms for solving many problems. These
problems include both numerical and non-numerical problems like matrix
problems [11,14,36], polynomial evaluation [24,25], arithmetic computation
[23], graph problems [3,12,17,33], and sorting [16,29,37]. A recent
development in applied computation theory has been the solution of
geometric problems by a uniprocessor system [6,8,20,27,34]. It is
illustrated in [34] that geometric problems are frequently encountered in
operations research, pattern recognition, computer graphics, and statistics.

The topic of this thesis is the study of the solution of geometric
problems by parallel computing systems. We shall design and analyze parallel
algorithms with reference to two systems: the shared memory machine [26]
and the cube-connected-cycles [31]. The validity of the first model resides
in uncovering the inherent data-dependence of given problems, while that of
the second is the development of practical algorithms.

1.1  Parallel  Computing  Systems 

A  meaningful  study  of  the  design  and  analysis  of  parallel  algorithms 
requires  a  precise  model  of  computation.  In  this  section,  we  shall 
describe  two  systems  which  are  adopted  in  this  thesis. 

1.1.1 The Shared Memory Machine (SMM)

Several workers have designed and analyzed efficient parallel
algorithms with reference to a shared memory machine [3,10,14,16,17,29,
33,37]. In this model (refer to Figure 1), the processors can communicate
with each other through memory. Each processor is capable of performing
arithmetic operations, boolean operations, comparisons and, possibly, the
calculations of trigonometric functions in unit time. The main memory
consists of a number of parallel memory units, each of which contains a
sufficient number of words. It takes constant time to transmit data
from any processor to any memory unit and vice versa. Processors are
allowed to simultaneously read from, but not write in, the same word.
However, two processors are not permitted to read from or write into
different words of the same memory unit at the same time. (This situation
is referred to as a memory conflict.)

[Figure 1: processors and memory units]

We shall assume that the processors are indexed 0 through n-1 and the
memory units are indexed 0 through m-1. Arrays A(0:m-1) of elements
A(0),...,A(m-1) are stored systematically in the main memory such that
A(i) is in memory unit i.

1.1.2  The  Cube  Machine  (CM)  and  the  Cube-Connected-Cycles  (CCC) 

In these models there is no shared memory. Each processor has a
private RAM memory. Each processor, as in the SMM, is capable of
performing arithmetic operations, boolean operations, comparisons and
calculating trigonometric functions in unit time.

Assume that n = 2^k and let BIT_j(a) be the (j+1)-th least significant
bit in the binary expansion of a. In the Cube Machine, the processors
are interconnected as a k-dimensional cube, that is, processor i is
connected to processors i + (1 - 2BIT_j(i))2^j, 0 <= j < k. Data may be
transmitted from one processor to another only via this interconnection
pattern.
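
The neighbour formula can be checked with a short sketch. The function names `bit` and `cube_neighbours` are illustrative and do not come from the thesis; the code only restates the rule that processor i is joined to i + (1 - 2BIT_j(i))2^j for each dimension j.

```python
def bit(j: int, a: int) -> int:
    """BIT_j(a): the (j+1)-th least significant bit of a."""
    return (a >> j) & 1


def cube_neighbours(i: int, k: int) -> list[int]:
    """Neighbours of processor i in a k-dimensional cube of 2^k processors."""
    # Adding (1 - 2*BIT_j(i)) * 2^j flips bit j of i: +2^j if the bit is 0,
    # -2^j if the bit is 1.
    return [i + (1 - 2 * bit(j, i)) * 2**j for j in range(k)]


neighbours_of_5 = cube_neighbours(5, 3)
```

Since the formula simply complements bit j, every neighbour differs from i in exactly one bit, which is the usual description of a binary hypercube.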
Processor i can be identified by a pair of integers (l,p) such that
l*2^r + p = i, where r is the smallest integer for which r + 2^r >= k. In
the cube-connected-cycles, which was recently proposed by Preparata and
Vuillemin [28], processor (l,p) is connected to processors
(l,(p+1) mod 2^r), (l,(p-1) mod 2^r) and (l + (1 - 2BIT_p(l))2^p, p)
(refer to Figure 2). The geometric structure underlying the
interconnection of the processors is that of a k-dimensional cube, but the
CCC requires only three connections per processor. Once again, data
transmission from processor to processor is possible only via the
available connections.

The development of algorithms with reference to the CCC, unlike that
on the SMM which considers only the data-dependence, concerns also the
data-movement. Moreover, this machine complies with the present
technological constraints of VLSI design [22]. It has been shown that the
CCC is remarkably well suited for implementing efficient algorithms such
as the Radix-2 Fast Fourier Transform, Bitonic Sorting, etc.

Algorithms for some interesting problems, such as bitonic merge and
cyclic shift, perform a sequence of basic operations on data which are
successively 2^{k-1}, 2^{k-2}, ..., 2^0 = 1 locations apart. This class of
algorithms is referred to as the DESCEND class [31]. The dual class ASCEND
consists of algorithms which perform a sequence of basic operations on
data that are successively 1 = 2^0, 2^1, ..., 2^{k-1} locations apart.
Algorithms in the DESCEND class are of the form:

    for i <- k-1 down to 0 do
        foreach j, 0 <= j < 2^k do
            if BIT_i(j) = 0 then OPER(A(j), A(j + 2^i));

where OPER(A(j), A(j + 2^i)) is some basic operation on the operands A(j)
and A(j + 2^i). ASCEND differs from DESCEND only in the control loop. The
control loop of ASCEND is: for i <- 0 to k-1 do. In both cases, the number
of parallel steps on the CM is clearly k. In [28], Preparata and Vuillemin
show that algorithms in both classes can be implemented on the CCC in O(k)
parallel steps. They also show that other problems (such as permutation,
unshuffle, bit reversal, odd-even merge, Fast Fourier Transform,
convolution, matrix transposition) having programs consisting of a short
sequence of algorithms in the DESCEND or ASCEND classes run in O(k)
parallel steps on the CCC. There are also applications, such as bitonic
sort, odd-even sort, and calculations of symmetric functions, for which
the combining step of the two results of a recursive call is itself an
algorithm in the DESCEND or ASCEND class. These algorithms run in
O((logn)^2) parallel steps on the CCC.
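
As a rough illustration of the DESCEND schema, the following sequential sketch simulates the k parallel steps and uses compare-exchange as OPER, which gives a bitonic merge. The names `descend`, `ascend` and `compare_exchange` are assumptions made for this sketch; the inner loop over j stands in for the processors acting in parallel, which is safe here because the pairs touched within one step are disjoint.

```python
def descend(A, oper):
    """Apply oper to pairs 2^(k-1), 2^(k-2), ..., 1 locations apart."""
    n = len(A)
    k = n.bit_length() - 1
    assert n == 1 << k, "length must be a power of two"
    for i in range(k - 1, -1, -1):       # control loop of DESCEND
        for j in range(n):               # conceptually parallel
            if (j >> i) & 1 == 0:        # BIT_i(j) = 0
                oper(A, j, j + (1 << i))
    return A


def ascend(A, oper):
    """Same body as descend, with the control loop running from 0 to k-1."""
    n = len(A)
    k = n.bit_length() - 1
    assert n == 1 << k, "length must be a power of two"
    for i in range(k):
        for j in range(n):
            if (j >> i) & 1 == 0:
                oper(A, j, j + (1 << i))
    return A


def compare_exchange(A, lo, hi):
    """OPER for bitonic merge: leave the smaller element at the lower index."""
    if A[lo] > A[hi]:
        A[lo], A[hi] = A[hi], A[lo]


# A bitonic sequence (increasing, then decreasing) is sorted by one DESCEND pass.
merged = descend([1, 4, 6, 7, 5, 3, 2, 0], compare_exchange)
```

Only the order of the outer loop distinguishes the two classes, which is why a single routine parameterised by OPER covers each of them.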

1.2  Class  of  Problems  Considered 

In  this  paper,  parallel  algorithms  are  presented  for  several  geometric 
problems,  based  on  the  parallel  computing  systems  described  in  Section  1.1. 
The  geometric  problems  which  are  considered  here  are  the  following. 

We first consider a subproblem of the intersection problems. Given a
set of N rectangles with their sides parallel to the coordinate axes, we
want to report all pairs of rectangles which intersect. Apart from being
interesting in its own right, this problem has an important application
in VLSI circuitry design rule checking [4,19]. Bentley and Wood [7]
recently investigated this problem for a uniprocessor system and developed
an O(NlogN+k)* time algorithm for reporting all k such intersecting pairs.
We shall develop algorithms for solving the rectangle intersection problem
on the SMM and on the CCC. As two intermediate steps in our approach,
we shall study the problems of reporting intersecting pairs of horizontal
and vertical line segments and of two-dimensional range searching. The
latter problem is also important in its own right and has applications in
database systems.

* All logarithms in this thesis are to the base 2.

The second problem to be studied is an inclusion problem. Given a
planar graph embedded in the plane as a straight line graph G [21] with
N vertices and a set of M points, for each of these M points, we have to
find the region of the planar subdivision induced by G which contains it.
In short, we shall refer to this problem as planar point location. This
problem is quite important in computational geometry. Indeed, point
location is a crucial step in our three-dimensional convex hull algorithms
to be developed. The most recent and practical sequential result is due to
Preparata [28]. This algorithm runs in time O(MlogN) on a data structure
which can be constructed in time O(NlogN).

The next two problems to be investigated are two-dimensional and
three-dimensional convex hulls. Given a set S of N points, the convex
hull CH(S) of S is the intersection of all convex sets containing S.
The convex hull CH(S) is a convex polyhedral region. Chapter 3 of [34]
demonstrates the importance of the convex hull problems, which arise in
statistics, numerical analysis, and image processing, as well as in many
other fields. Preparata and Hong [30] show that the convex hulls of
sets of points in both two and three dimensions can be determined
serially with O(NlogN) operations.

The last problem is the construction of the Voronoi diagram for a
set of N points in the plane. A Voronoi diagram is a partition of the
plane into N polygonal regions, each of which is associated with a given
point and is the locus of points closer to the given point than to any
other point. This problem arises in clustering analysis [13] and in the
context of several closest-point problems [35]. While optimal O(NlogN)
serial algorithms exist, we shall consider the construction of Voronoi
diagrams on the SMM and on the CCC.

We shall develop algorithms for the above problems on the SMM
with a number of processors linear in the problem size and on the
cube machine with numbers of processors both linear and superlinear in
the problem size. The algorithms that we develop for the cube
machine are ASCEND and DESCEND programs; therefore they can be
implemented on the CCC without significantly increasing the time
complexity.

1.3  Outline  of  Thesis 

In the next chapter we develop some basic tools which will be used
in later chapters. Each of the next five chapters is devoted to a
problem described in Section 1.2. Each chapter consists of three main
algorithms: the first for the SMM and the second for the CCC, both
with a number of processors linear in the problem size; the last one
for the CCC with a number of processors superlinear in the problem
size. Chapter 3 is on intersection of rectangles. Chapter 4 is on
planar point location. Chapters 5 and 6 are on convex hulls in two
dimensions and three dimensions respectively. Chapter 7 is on the
construction of Voronoi diagrams. In Chapter 8 conclusions are
drawn.

CHAPTER 2


BASIC ALGORITHMS

In this thesis, parallel algorithms are sought for various geometric
problems. The strategy used to develop an algorithm for a given problem
is to devise a technique which reduces the solution of the problem to
the solution of a sequence of problems for which efficient parallel
algorithms can be developed. In anticipation of later use, we develop
some basic parallel algorithms.

2.1  On  the  SMM  with  N  Processors 

We shall discuss the problem of data extraction and the O(logN)
time solution for finding the minimum or maximum of a set of N numbers.

2.1.1 Data Extraction

We consider the following extraction problem. Given an ordered
array A(0:N-1) and an associated array t(0:N-1) of tags, we want to move
elements A(i), with t(i) = 1, to consecutive memory units in a stable
fashion, i.e., preserving the original order.

We first determine the rank R(i) of element A(i), which is the
number of elements preceding it with tags set to 1. Then elements with
tags equal to 1 are moved to consecutive memory units defined by their
ranks. We use Nassimi's ranking algorithm. The algorithm is best
described recursively. Divide a 2^k-element set into two halves, each
containing 2^{k-1} consecutive elements. Let R(i) be the rank of A(i) in
its 2^{k-1}-set. Let S(i) be the total number of elements with tags equal
to one in the 2^{k-1}-set containing A(i). Then the rank of an element in
the 2^k-set is R(i) if BIT_{k-1}(i) equals 0 (note that BIT_{k-1}(i) = 0
for the left 2^{k-1}-set of a 2^k-set) and R(i) + S(i - 2^{k-1}) if
BIT_{k-1}(i) equals 1. (Note that S(i - 2^{k-1}) is constant for all terms
of the left 2^{k-1}-set.) Unfolding the recursion yields the iterative
procedure RANK:



procedure RANK(A,t,R):
/* determine R(i) = number of A(j) for which t(j) = 1 and j < i */
begin
    foreach i, 0 <= i < N do
        begin R(i) <- 0
              if t(i) = 1 then S(i) <- 1 else S(i) <- 0
        end
    for k <- 0 to logN-1 do
        foreach i, 0 <= i < N do
            begin T(i + (1-2BIT_k(i))2^k) <- S(i)
                  if BIT_k(i) = 1 then R(i) <- R(i)+T(i)
                  S(i) <- S(i)+T(i)
            end
end
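
Procedure RANK can be simulated sequentially as a sanity check. The sketch below is an illustration only: the function name `rank` is an assumption, N is taken to be a power of two, and each parallel `foreach` is modelled as two passes (all sends, then all local updates) so that every processor sees the values of the previous step.

```python
def rank(t):
    """Return R with R[i] = number of j < i having t[j] == 1."""
    N = len(t)
    logn = N.bit_length() - 1
    assert N == 1 << logn, "N must be a power of two"
    R = [0] * N
    S = [1 if tag == 1 else 0 for tag in t]
    for k in range(logn):
        # T(i + (1 - 2 BIT_k(i)) 2^k) <- S(i): send S to the neighbour across
        # dimension k, whose index is i with bit k flipped.
        T = [0] * N
        for i in range(N):
            T[i ^ (1 << k)] = S[i]
        for i in range(N):
            if (i >> k) & 1 == 1:   # right half: add the count of the left half
                R[i] += T[i]
            S[i] += T[i]            # S becomes the count of the merged 2^(k+1)-set
    return R


tags = [0, 1, 1, 0, 1, 0, 0, 1]
ranks = rank(tags)
```

For the tags above the result agrees with the exclusive prefix sums of the tag array, which is exactly the definition of the rank.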


It is easy to see that procedure RANK runs in time O(logN) on an SMM
with N processors and N memories. We are now able to describe the entire
procedure EXTRACT1. (|A| is the number of elements with tag = 1.)

procedure EXTRACT1(A,t):
/* extract elements A(i) with t(i) = 1 and move them to consecutive
   memory units beginning at unit 0 */
begin
    /* determine the rank R(i) of each element A(i) */
    call RANK(A,t,R)

    /* route A(i) to R(i) */
    foreach i, 0 <= i < N do
        begin T(i) <- A(i)
              if t(i) = 1 then A(R(i)) <- T(i)
        end

    /* determine |A| and fill the right end of A with null */
    if t(N-1) = 0 then |A| <- R(N-1) else |A| <- R(N-1)+1
    foreach i, |A| <= i < N do A(i) <- null
end

The time complexity of EXTRACT1 is mainly determined by the first step
which calls procedure RANK. Therefore, procedure EXTRACT1 runs in time
O(logN) on an SMM with N processors and N memories.
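
The effect of EXTRACT1 can be sketched sequentially as follows. This is only an illustration of the input/output behaviour: the name `extract1` is an assumption, the ranks are obtained with ordinary prefix sums rather than with the parallel procedure RANK, and `None` plays the role of null.

```python
from itertools import accumulate


def extract1(A, t):
    """Move the elements A[i] with t[i] == 1 to the front, preserving order."""
    N = len(A)
    # Exclusive prefix sums of the tags: R[i] = number of tagged elements before i.
    R = [0] + list(accumulate(t))[:-1]
    out = [None] * N                 # the right end stays filled with null
    for i in range(N):               # conceptually one parallel routing step
        if t[i] == 1:
            out[R[i]] = A[i]
    return out


packed = extract1(list("abcdefgh"), [0, 1, 1, 0, 1, 0, 0, 1])
```

Because the rank is strictly increasing over the tagged elements, no two of them are routed to the same memory unit, and their original order is preserved.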


Theorem 2.1. A selected subset of an ordered array A(0:N-1) of elements
can be moved to consecutive memory units in a stable fashion in time
O(logN) on an SMM with N processors and N memory units.

2.1.2  Finding  the  Minimum  (Maximum)  of  N  Numbers 

We now review a well-known O(logN) time algorithm for finding the
minimum of a set S of N numbers: we first partition S into two subsets
S_1 and S_2 of equal size. We then find the minima m_1 of S_1 and m_2 of
S_2 simultaneously. The minimum of S is the smaller of m_1 and m_2.
It can be written as follows.

function MINIMUM(S)
/* returns the minimum of S */
begin
    foreach i, 0 <= i < N do S'(i) <- S(i)
    for k <- 0 to logN-1 do
        foreach i, 0 <= i < N do
            if BIT_k(i) = 0 then
                if S'(i) > S'(i+2^k) then S'(i) <- S'(i+2^k)
    return (S'(0))
end

Similarly, we can find the maximum of N numbers on an SMM with N
processors.

Theorem 2.2. The minimum (maximum) of N numbers can be determined in time
O(logN) on an SMM with N processors.
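
A sequential simulation of function MINIMUM is sketched below, assuming N is a power of two. The name `parallel_minimum` is illustrative; the snapshot `prev` models the fact that all processors read their operands before any of them writes.

```python
def parallel_minimum(S):
    """Simulate the logN-step minimum computation; the result ends up in cell 0."""
    N = len(S)
    logn = N.bit_length() - 1
    assert N == 1 << logn, "N must be a power of two"
    s = list(S)
    for k in range(logn):
        prev = list(s)                       # values at the start of step k
        for i in range(N):                   # conceptually parallel
            if (i >> k) & 1 == 0:            # BIT_k(i) = 0, so i + 2^k < N
                partner = i + (1 << k)
                if prev[i] > prev[partner]:
                    s[i] = prev[partner]
    return s[0]


smallest = parallel_minimum([7, 3, 9, 1, 4, 8, 2, 6])
```

After step k, cell 0 holds the minimum of the first 2^{k+1} elements, so logN steps suffice.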

2.2  On  the  CCC  with  N  Processors 

We shall discuss some basic tools like data extraction, selected
broadcasting, parallel searching, and finding the minimum (maximum) of N
numbers. We shall develop efficient algorithms for these problems on a
CCC with a number of processors linear in the problem size.

2.2.1 Data Extraction

Procedure EXTRACT1 described in Section 2.1.1 is not suitable for
implementation on the CCC. The step which causes difficulties is the
routing of data to the appropriate processors as determined by the data
rank. This routing will be referred to as concentration. During
concentration, selected data are moved to consecutive processors. Nassimi
[26] solved this problem on a CM as follows. Let t(i), when it is equal to
1, be the indicator that data item A(i) is to be moved to the R(i)-th
processor. First, data A(i), with t(i) = 1, are moved to processors such
that the processor index and R(i) agree in bit position 0. The next
routing assures that processor indices and R(i) agree in bit positions 0
and 1; and so on until data are routed to the correct processors. Figure 3
is an example of concentration with t(i) = 1 for i = 1, 2, 4, 7. Figure
3(a) shows the initial values of R(i) in binary. The first, second, and
third iterations of the above procedure yield the configurations of
Figures 3(b), 3(c) and 3(d) respectively. The third iteration completes
the concentration. The formal description of the concentration algorithm
is as follows:

procedure CONCENTRATE(A,R,t):
/* route A(i) with t(i) = 1 to processor R(i). This procedure will be
   used to move data A(i) with t(i) = 1 to consecutive processors */
begin
    for k <- 0 to logN-1 do
        foreach i, 0 <= i < N do
            if t(i) = 1 and BIT_k(i) ≠ BIT_k(R(i))
            then begin A(i+(1-2BIT_k(i))2^k) <- A(i)
                       R(i+(1-2BIT_k(i))2^k) <- R(i)
                       t(i+(1-2BIT_k(i))2^k) <- t(i)
                 end
end
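
The bit-by-bit routing of CONCENTRATE can be traced with a small simulation. The sketch rests on two assumptions that the pseudocode leaves implicit: a processor that forwards its item becomes empty, and each step is synchronous, so the new configuration is built from the old one. The function name `concentrate` and the collision assertion are illustrative additions.

```python
def concentrate(A, R, t):
    """Route each tagged item A[i] to processor R[i], fixing one address bit per step."""
    N = len(A)
    logn = N.bit_length() - 1
    assert N == 1 << logn, "N must be a power of two"
    A, R, t = list(A), list(R), list(t)
    for k in range(logn):
        new_A, new_R, new_t = [None] * N, [None] * N, [0] * N
        for i in range(N):                        # conceptually parallel
            if t[i] != 1:
                continue
            if (i >> k) & 1 != (R[i] >> k) & 1:   # BIT_k(i) differs from BIT_k(R(i))
                dest = i ^ (1 << k)               # i + (1 - 2 BIT_k(i)) 2^k
            else:
                dest = i
            assert new_t[dest] == 0, "two items routed to one processor"
            new_A[dest], new_R[dest], new_t[dest] = A[i], R[i], 1
        A, R, t = new_A, new_R, new_t
    return A, t


# The example from the text: items at processors 1, 2, 4, 7 with ranks 0, 1, 2, 3.
data, tags = concentrate(list("abcdefgh"),
                         [0, 0, 1, 0, 2, 0, 0, 3],
                         [0, 1, 1, 0, 1, 0, 0, 1])
```

In this example the tagged items occupy processors 0, 3, 4, 7 after the first step, 0, 1, 6, 7 after the second, and 0, 1, 2, 3 after the third.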


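[Figure 3. Example of concentration: (a) initial configuration; (b), (c),
(d) configurations after the first, second, and third iterations.]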

It is straightforward to see that procedure CONCENTRATE can be
implemented on a CCC with N processors in O(logN) steps; and procedure
RANK, which is introduced in Section 2.1.1 to determine the number of
elements with t(i) = 1 to the left of each data item, can also be carried
out on a CCC with N processors in O(logN) steps. We now describe an
O(logN) time data extraction algorithm on a CCC with N processors:

procedure EXTRACT2(A,t):
/* extract A(i) with t(i) = 1 and move them to consecutive processors
   beginning at processor 0 */
begin
    call RANK(A,t,R)

    /* determine |A| = number of A(i) with t(i) = 1 */
    if t(N-1) = 0 then |A| <- R(N-1) else |A| <- R(N-1)+1
    call CONCENTRATE(A,R,t)

    /* fill the right end of array A with null */
    foreach i, |A| <= i < N do A(i) <- null
end

Theorem 2.3. A selected subset of an ordered array A(0:N-1) of elements
can be moved to consecutive memory units in a stable fashion on a CCC
with N processors in O(logN) steps.

2.2.2  Selected  Broadcasting 

Being able to transmit data efficiently is essential for a fast algorithm.
We now consider a special case of selected broadcasting. Let P(0:N-1) be a
storage array and let {a_1,...,a_n} be a selected subset of {0,...,N-1},
where a_i < a_{i+1}. We denote the expression a_{i+1} - a_i - 1 by L(a_i)
for i = 1,...,n-1, and N - a_n - 1 by L(a_n). Our objective is to copy
data D(a_i) into P(a_i), P(a_i + 1), ..., P(a_i + L(a_i)) for i = 1,...,n.
For example, letting N = 9, n = 2, a_1 = 2 and a_2 = 5, we would copy D(2)
into P(2), P(3), P(4) and D(5) into P(5), P(6), P(7) and P(8).
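
The required result of selected broadcasting can be stated as a short sequential reference. The function name `broadcast_reference` is an assumption for this sketch; it only fixes what the output should be and says nothing about how the parallel routing achieves it.

```python
def broadcast_reference(D, selected, N):
    """Copy D[a] into P[a], ..., P[a + L(a)] for every selected index a.

    `selected` must be sorted in increasing order. L(a) is the gap to the next
    selected index minus one, or N - a - 1 for the last selected index.
    """
    P = [None] * N
    for pos, a in enumerate(selected):
        nxt = selected[pos + 1] if pos + 1 < len(selected) else N
        L = nxt - a - 1
        for j in range(a, a + L + 1):
            P[j] = D[a]
    return P


labels = [f"D{i}" for i in range(9)]
P = broadcast_reference(labels, [2, 5], 9)
```

With N = 9 and selected indices 2 and 5, this copies D(2) into P(2), P(3), P(4) and D(5) into P(5) through P(8); the cells before the first selected index are left untouched.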


We shall describe the selected broadcasting procedure along with an
example. Let n = 1, N = 16, a_1 = 5, and L(a_1) = 5, that is, we want to
move D(5) to P(5), P(6), ..., P(10). In Figure 4 the shaded locations show
the data movement in selected broadcasting. Selected broadcasting is
carried out by the same routing as in concentration: during the k-th
iteration, data D(i) is to be copied into P(i+h), P(i+h+1), ...,
P(i+L(i)), where h = min(2^k, L(i)). Referring to the example, during the
0th iteration, L(5) = 5 indicates that D(5) is to be copied into P(6),
P(7), ..., P(10); and during the 3rd iteration, L(0) = 10 indicates that
D(0) is to be copied into P(8), P(9), P(10). If L(i) >= 2^k, we move data
D(i) to the processor such that the processor index and i + 2^k agree in
bits 0,1,...,k. Referring to the 1st iteration of the example, D(4) is
moved to processor 6; and referring to the 2nd iteration, D(4) is moved to
processor 0, such that 0 and 8 = 4 + 2^2 agree in bits 0,1,2. During this
routing, data may be moving backward (i.e., moving to a processor with
lower index) which is contrary to our objective of forward broadcasting.
We indicate this transitional state by setting the flag BACKWARD(i) to 1.
We have to adjust L(i) by ±2^k depending on whether data is moved backward
or forward. In the example, D(4) is moved to processor 0 during the 2nd
iteration, so the flag BACKWARD(0) is set to 1 and L(0) is assigned to be
L(4) + 2^2 = 10. When L(i) < 2^{k+1}, we know that D(i) will not be moved
in later iterations. Moreover, when 0 <= L(i) < 2^{k+1} and D(i) is not in
the backward transitional state, we can copy D(i) into P(i) and set L(i)
to -1. Referring to the 1st iteration of the example, L(7) is first set to
3, so 0 <= L(7) < 2^2 and BACKWARD(7) is 0; then we can copy D(7) into
P(7) and set L(7) to -1. We claim that at the end of (logN+1) iterations,
the broadcasting is complete. The program for the selected broadcasting
is as follows:



procedure SELECTED_BROADCAST(D,L,P)
/* when L(i) >= 0, copy D(i) into P(i), P(i+1), ..., P(i+L(i)).
   BACKWARD will be a flag for backward transitional stage.
   T, TL, TBACKWARD will be used as temporary storage for D, L,
   BACKWARD respectively */
begin
    foreach i, 0 <= i < N do
        begin TL(i) <- -1, BACKWARD(i) <- 0 end
    for k <- 0 to logN-1 do
        foreach i, 0 <= i < N do
        begin
1.          /* move D(i) to the processor such that the
               processor index and the destination agree in
               bits 0,1,...,k */
            if L(i) >= 2^k then
                begin T(i+(1-2BIT_k(i))2^k) <- D(i)
                      TL(i+(1-2BIT_k(i))2^k) <- L(i)+(2BIT_k(i)-1)2^k
                      TBACKWARD(i+(1-2BIT_k(i))2^k) <- BIT_k(i)
                end
2.          /* determine if data in D(i) is permanent,
               can be discarded or has to be saved */
            if 0 <= L(i) < 2^{k+1} then
                begin if BACKWARD(i) = 0 then P(i) <- D(i)
                      L(i) <- -1
                end
3.          /* determine if data in temporary location T(i) is
               permanent, can be discarded or has to be saved */
            if 0 <= TL(i) < 2^{k+1} then
                begin if TBACKWARD(i) = 0 then P(i) <- T(i)
                      TL(i) <- -1
                end
4.          if TL(i) >= 2^{k+1} then
                begin D(i) <- T(i)
                      L(i) <- TL(i)
                      BACKWARD(i) <- TBACKWARD(i)
                      TL(i) <- -1
                end
        end
end
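
The four numbered steps can be traced with the following sequential simulation. It is a sketch under stated assumptions rather than a definitive transcription: the loop is taken to run for k = 0, ..., logN-1, step 1 is executed by all processors before steps 2 to 4 (modelling synchronous parallel execution), N is a power of two, and the name `selected_broadcast` is illustrative.

```python
def selected_broadcast(D, L):
    """Simulate SELECTED_BROADCAST; L[i] = -1 marks a processor with nothing to send."""
    N = len(D)
    logn = N.bit_length() - 1
    assert N == 1 << logn, "N must be a power of two"
    D, L = list(D), list(L)
    P = [None] * N
    T, TL, TB = [None] * N, [-1] * N, [0] * N   # temporaries for D, L, BACKWARD
    B = [0] * N                                  # BACKWARD flags
    for k in range(logn):
        step, limit = 1 << k, 1 << (k + 1)
        # Step 1: forward data whose remaining span is at least 2^k.
        for i in range(N):
            if L[i] >= step:
                b = (i >> k) & 1
                dest = i + (1 - 2 * b) * step
                T[dest] = D[i]
                TL[dest] = L[i] + (2 * b - 1) * step
                TB[dest] = b                     # 1 means the data moved backward
        for i in range(N):
            # Step 2: data held in D(i) will not move again.
            if 0 <= L[i] < limit:
                if B[i] == 0:
                    P[i] = D[i]
                L[i] = -1
            # Step 3: same decision for the data that just arrived in T(i).
            if 0 <= TL[i] < limit:
                if TB[i] == 0:
                    P[i] = T[i]
                TL[i] = -1
            # Step 4: arrived data that must travel further is saved.
            if TL[i] >= limit:
                D[i], L[i], B[i] = T[i], TL[i], TB[i]
                TL[i] = -1
    return P


# The example from the text: N = 16, D(5) is broadcast to P(5), ..., P(10).
D = [None] * 16
L = [-1] * 16
D[5], L[5] = "x", 5
P = selected_broadcast(D, L)
```

Tracing the example reproduces the values quoted in the text: L(7) is set to 3 in iteration 1, and processor 0 receives the data with L(0) = 10 and the backward flag set in iteration 2.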

The correctness of SELECTED_BROADCAST is not immediate. We must show
that (1) whenever data is to be stored at some location, the previous
information at that location can be discarded; (2) D(a_i) is moved to
P(a_i), ..., P(a_i+L(a_i)) for i = 1,...,n at the termination of the
procedure.

Theorem 2.4. Procedure SELECTED_BROADCAST is correct.

Proof. It is observed that at the beginning of each iteration TL(i) = -1
for all i; so prior to step 1, information at T(i), TL(i) and TBACKWARD(i)
can be discarded for all i.

Suppose BIT_k(i) = 0 and L(2^k+i) >= 2^k at step 1. Then TL(i) is
assigned the value L(2^k+i) + 2^k >= 2^{k+1} and by the specification of
the problem, L(i) < 2^{k+1}. At step 2, L(i) is then set to -1, which
implies that prior to step 4, information at D(i), L(i), BACKWARD(i) can
be discarded. Suppose BIT_k(i) = 1 and L(i) >= 2^k at step 1. By the
specification of the problem, L(i-2^k) < 2^{k+1} at step 1. TL(i) may be
set to L(i-2^k) - 2^k < 2^k or remain -1 depending on the value of
L(i-2^k); in either case TL(i) is -1 at the completion of step 3.
Therefore, step 4 has no storage conflicts.

To complete the proof, it is now sufficient to show for n = 1 that D(a_1)
is correctly moved to P(a_1), ..., P(a_1+L(a_1)) and that data D(a_1) is
never written into a location outside {a_1, a_1+1, ..., a_1+L(a_1)}
during the process. It is simple to see that the routing in the algorithm
guarantees that D(a_1) reaches processors a_1, a_1+1, ..., a_1+L(a_1).
Indicators BACKWARD(i) and TBACKWARD(i) determine whether a piece of data
arriving at processor i should be written into P(i). If the data is
arriving from a processor with higher index then this data is in a
transitional stage; otherwise this data is at its destination. □

Procedure SELECTED_BROADCAST runs in time O(logN) on a CCC with N
processors.

Theorem 2.4. Given a subset {a_1,...,a_n} of {0,...,N-1} with
a_i < a_{i+1}, data items D(a_i) can be copied into P(a_i), P(a_i+1), ...,
P(a_i+L(a_i)), where L(a_i) = a_{i+1} - a_i - 1, for i = 1,...,n, in time
O(logN) on a CCC with N processors.

2.2.3  Parallel  Searching 

Given an array A(0:N-1) of N elements in ascending order and a set
Q(0:M-1) of test elements, we want to find for each i, 0 <= i < M, the
element A(j_i) such that A(j_i) <= Q(i) < A(j_i+1). We present the set of
test elements in descending order. Then A and Q are merged using Batcher's
bitonic merge. Then A(j) is broadcast to all the test elements between
A(j) and A(j+1) in the resulting merged sequence of A and Q. For example,
let N = 4, M = 5, let A(0),...,A(3) be 1, 3, 4, 8 respectively, and let
Q(0),...,Q(4) be 1, 2, 4, 5, 6 respectively. Figure 5(a) shows the
sequences. Figure 5(b) shows the merged sequence. Then A(0) is broadcast
to Q(0), Q(1), and A(2) is broadcast to Q(2), Q(3), Q(4).
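
The merge-then-broadcast idea can be illustrated by a sequential sketch. The function name `search_by_merge` is an assumption, an ordinary sort stands in for the bitonic merge, and a single left-to-right sweep stands in for the selected broadcast; the result records the index j_i rather than the element itself.

```python
def search_by_merge(A, Q):
    """For each Q[i], return the index j with A[j] <= Q[i] < A[j+1] (None if Q[i] < A[0])."""
    # Tag each element with its origin. On equal values an element of A (kind 0)
    # sorts before an element of Q (kind 1), so that A[j] <= Q[i] holds.
    merged = sorted([(a, 0, j) for j, a in enumerate(A)] +
                    [(q, 1, i) for i, q in enumerate(Q)])
    P = [None] * len(Q)
    last_a = None
    for _value, kind, idx in merged:
        if kind == 0:
            last_a = idx          # this A element is "broadcast" to what follows
        else:
            P[idx] = last_a
    return P


positions = search_by_merge([1, 3, 4, 8], [1, 2, 4, 5, 6])
```

For the example in the text, test elements 1 and 2 receive A(0) and test elements 4, 5 and 6 receive A(2).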

The following program performs the parallel searching.

procedure SEARCH(A,Q,P):
/* determine P(i) = A(j_i) such that A(j_i) <= Q(i) < A(j_i+1) */
begin
    /* merge sequences A and Q */
    foreach i, 0 <= i < N do D(i) <- A(i)
    foreach i, 0 <= i < M do D(N+i) <- Q(i)
    apply bitonic merge to D;

    /* determine the distance L(i) over which D(i) has to be broadcast */
    foreach i, 0 <= i < N+M do
        begin t(i) <- 0; L(i) <- -1
              if D(i) ∈ A and D(i+1) ∈ Q
              then begin t(i) <- 1; FIRST(i) <- i end
        end
    call EXTRACT2(FIRST,t)
    foreach i, 0 <= i < N+M do
        if FIRST(i) ≠ null then L(i) <- FIRST(i+1)-FIRST(i)-1
    move L(i) to processor FIRST(i) by a procedure similar to CONCENTRATE

    /* broadcast D(i) to P(i),...,P(i+L(i)) */
    call SELECTED_BROADCAST(D,L,P)

    /* move P to original position */
    foreach i, 0 <= i < N+M do
        if D(i) ∈ Q then t(i) <- 1 else t(i) <- 0
    call EXTRACT2(P,t)
end

This  procedure  runs  in  time  0(log(N+M>)  with  N+M  processors.  Therefore, 

2 

parallel  searching  runs  in  time  0((logM)  +  log(N+M>)  on  a  CCC  with  N+M 
processors. 

Theorem 2.5. Given an ordered array A(0:N-1) of N elements and a set
Q(0:M-1) of test elements, for each i, 0 ≤ i < M, the element A(j_i),
such that A(j_i) ≤ Q(i) < A(j_i+1), can be determined in time
O((log M)^2 + log(M+N)) on a CCC with N+M processors.
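
The effect of the merge-and-broadcast step can be checked with a short sequential sketch. It is not the CCC procedure itself: an ordinary sort stands in for the bitonic merge, a running variable stands in for SELECTED_BROADCAST, and the function name `search` is an assumption made for illustration only.

```python
def search(a, q):
    """For each q[i], return the largest a[j] with a[j] <= q[i] (None if none)."""
    # Tag data elements with 0 and test elements with 1 so that, on ties,
    # an element of A precedes the equal element of Q in the merged order.
    merged = [(x, 0, None) for x in a] + [(x, 1, i) for i, x in enumerate(q)]
    merged.sort(key=lambda item: (item[0], item[1]))

    result = [None] * len(q)
    last_a = None  # most recent element of A seen; plays the role of the broadcast
    for value, kind, index in merged:
        if kind == 0:
            last_a = value
        else:
            result[index] = last_a
    return result


# The example from the text: A = 1, 3, 4, 8 and Q = 1, 2, 4, 5, 6.
print(search([1, 3, 4, 8], [1, 2, 4, 5, 6]))
```

With the data of the example above, Q(0) and Q(1) receive A(0) = 1, and Q(2), Q(3), Q(4) receive A(2) = 4.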


The  algorithm  presented  in  Section  2.1.2  for  finding  the  minimum 
(maximum)  of  N  numbers  is  directly  within  the  ASCEND  class.  Therefore, 
we  have  the  following  result. 

Theorem  2.6.  The  minimum  (maximum)  of  N  numbers  can  be  determined  in 
time  O(logN)  on  a  CCC  with  N  processors. 


CHAPTER  3 


INTERSECTION  OF  RECTANGLES 

Given a set of N rectangles (with sides parallel to the coordinate
axes) in the plane, we are asked to report all pairs of rectangles which
intersect. An important application of the problem is in VLSI design rule
checking [4,19]. Bentley and Wood [7] presented an O(N log N + k) (optimal)
time algorithm for reporting intersections of rectangles on a uniprocessor
machine, where k is the number of intersecting pairs found. In this
chapter we investigate this problem on parallel computing machines.

Our  approach  to  a  parallel  solution  of  the  problem  follows  the  general 
approach  of  Bentley  and  Wood  and  requires  two  intermediate  steps:  reporting 
intersections  of  horizontal  and  vertical  line  segments,  and  two-dimensional 
range  searching.  Two  rectangles  intersect  if  their  edges  intersect  or  one 
rectangle  entirely  encloses  the  other.  The  problem  of  finding  rectangle 
enclosure  can  be  reduced  to  that  of  two-dimensional  range  searching  as 
follows.  We  associate  with  each  rectangle  A  a  representative  point  a  in 
its  interior,  for  example,  its  leftmost  bottom  vertex.  If  point  a  lies 
within  rectangle  B,  then  either  B  entirely  encloses  A  or  A  and  B  have  an 
edge  intersection. 
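
The reduction can be sanity-checked by brute force. The sketch below assumes closed rectangles written as tuples (L, R, B, T) and uses the bottom-left vertex as the representative point; all function names are hypothetical. It compares "some vertical edge crosses some horizontal edge, or a representative point lies inside the other rectangle" against a direct overlap test.

```python
from itertools import product


def vh_cross(v, h):
    """v = (x, y_lo, y_hi) is a vertical segment, h = (y, x_lo, x_hi) a horizontal one."""
    x, y_lo, y_hi = v
    y, x_lo, x_hi = h
    return x_lo <= x <= x_hi and y_lo <= y <= y_hi


def edges(r):
    """Return (vertical edges, horizontal edges) of rectangle r = (L, R, B, T)."""
    left, right, bottom, top = r
    return ([(left, bottom, top), (right, bottom, top)],
            [(bottom, left, right), (top, left, right)])


def point_in(p, r):
    left, right, bottom, top = r
    return left <= p[0] <= right and bottom <= p[1] <= top


def intersect_by_reduction(a, b):
    va, ha = edges(a)
    vb, hb = edges(b)
    edge_hit = (any(vh_cross(v, h) for v, h in product(va, hb))
                or any(vh_cross(v, h) for v, h in product(vb, ha)))
    # Representative point: the bottom-left vertex of each rectangle.
    enclosure = point_in((a[0], a[2]), b) or point_in((b[0], b[2]), a)
    return edge_hit or enclosure


def intersect_directly(a, b):
    return a[0] <= b[1] and b[0] <= a[1] and a[2] <= b[3] and b[2] <= a[3]
```

The two tests agree on every pair of rectangles with small integer coordinates, which is what the two intermediate problems of this chapter rely on.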

The rectangles in the given set are indexed 0 to N-1. Each
rectangle r is defined by four reals giving its bottom B(r), top T(r),
left L(r) and right R(r) extreme points.

3.1  On  the  SMM  with  N  Processors 

In this section we shall present an algorithm which solves the
rectangle intersection problem in time O((log N)^2 + k) on a SMM with N
processors, where k is the maximum number of intersections per
rectangle. We shall discuss two intermediate problems: intersection of
horizontal and vertical line segments, and two-dimensional range
searching.

3.1.1  Intersection  of  Horizontal  and  Vertical  Line  Segments 

Given a set V(0:n-1) of n vertical line segments and a set H(0:m-1)
of m horizontal line segments, we want to report all pairs of vertical and
horizontal line segments which intersect. V(i) and H(i) are records.
In addition to the endpoint information, each V(i) contains two redundant
fields B and T: V(i)[B] and V(i)[T] are the y-values of the bottom and top
endpoints of V(i), respectively. H(i) also contains two fields L and R:
H(i)[L] and H(i)[R] are the x-values of the left and right endpoints of
H(i), respectively. Let Y(0:N-1) be a sorted array of distinct y-values of
the endpoints of the vertical line segments, where N ≤ 2n (refer to Figure
6(a)). We assume, for simplicity, that N+1 is a power of 2 and
Y(N+1) = Y(N)+1; the details of the general case are straightforward.

We now describe the search tree T which can be produced for the set of
vertical line segments. T is a binary tree of height log(N+1). In T,
NODE_i(j) denotes the j-th leftmost node at height i; it represents an
interval [B_i(j),T_i(j)] where B_i(j) = Y(j·2^i) and T_i(j) = Y((j+1)·2^i).
If i > 0, NODE_i(j) has two sons: NODE_{i-1}(2j) and NODE_{i-1}(2j+1). Each
NODE_i(j) contains a list of edges V(k) sorted in the positive x-direction
where V(k)[B] ≤ B_i(j) and T_i(j) ≤ V(k)[T]. Moreover V(k) does not belong
to any ancestor of NODE_i(j). Figure 6(b) is the search tree T for the
set of vertical lines in Figure 6(a); pairs of integers in the circles
are values of j·2^i and (j+1)·2^i.
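
As an illustration of which nodes store a given segment, the following sketch works on indices into Y rather than on y-values: a segment with V(k)[B] = Y(b) and V(k)[T] = Y(t) is described by the pair (b, t). The function name `allocation_nodes` and the index-based interface are assumptions made for this example only.

```python
def allocation_nodes(height, b, t):
    """Nodes (i, j) of T that store a segment spanning Y(b)..Y(t).

    Assumes 0 <= b < t <= 2**height; NODE_i(j) covers Y(j*2**i)..Y((j+1)*2**i).
    """
    found = []

    def visit(i, j):
        lo, hi = j << i, (j + 1) << i
        if b <= lo and hi <= t:
            found.append((i, j))      # node interval is spanned: store here
        elif i > 0:
            mid = (lo + hi) // 2      # index of T_{i-1}(2j) = B_{i-1}(2j+1)
            if b < mid:
                visit(i - 1, 2 * j)       # candidate for the left son
            if t > mid:
                visit(i - 1, 2 * j + 1)   # candidate for the right son

    visit(height, 0)
    return sorted(found)


# A segment from Y(1) to Y(6) in a tree of height 3.
print(allocation_nodes(3, 1, 6))
```

For this segment the sketch yields the nodes (0,1), (1,1) and (1,2), whose intervals [Y(1),Y(2)], [Y(2),Y(4)] and [Y(4),Y(6)] tile the segment without any of them having a spanned ancestor.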


(a) A set of vertical line segments and the corresponding Y array
(the "cuts" on the edges show the logarithmic segmentation for 1 and 6).

(b) Search tree T for the vertical line segments in (a).

Figure 6. Search tree T for vertical line segments.
We define C_{i,j} as a list of candidate segments for NODE_i(j) sorted
in the positive x-direction. We shall construct T level by level
beginning from the root. From C_{log(N+1),0}, which is a list of all the
vertical line segments sorted in the positive x-direction, we extract
segments which lie in the range [Y(0),Y(N)]. This list of extracted
segments is associated with NODE_{log(N+1)}(0). From the remaining segments
in C_{log(N+1),0}, we form C_{log(N+1)-1,0} and C_{log(N+1)-1,1} as follows.
Edge C_{log(N+1),0}(k) belongs to C_{log(N+1)-1,0} if
C_{log(N+1),0}(k)[B] < T_{log(N+1)-1}(0) and to C_{log(N+1)-1,1} if
C_{log(N+1),0}(k)[T] > B_{log(N+1)-1}(1). We repeat this for constructing
the set of NODE_i(j) for every j in each level i. Given j, all of
the three lists NODE_i(j), C_{i-1,2j} and C_{i-1,2j+1} can be determined in
O(log|C_{i,j}|) steps with |C_{i,j}| processors. At each level i, every line
segment can belong to at most four C_{i,j}. Therefore the sum of |C_{i,j}|
over all nodes j of level i is at most 4n.
Thus, 4n processors and O(log n) time are sufficient to construct one
level of T. T has log(N+1)+1 levels, so T can be constructed in
O((log N)^2) time with 4n processors and 4n memories. The following program
CONSTRUCT_T1 constructs T for vertical line segments. (A different program
CONSTRUCT_T2 will be written to construct T for edges of a planar graph.)

procedure CONSTRUCT_T1 (V)
/* construct the point location tree T for the vertical line segments V */
begin sort V(0:n-1) by x-values and y-values of bottom endpoints
  foreach k, 0 ≤ k < n, do C_{log(N+1),0}(k) ← V(k)

  /* construct T level by level */
  for i ← log(N+1) downto 0 do
    foreach j, 0 ≤ j < 2^{log(N+1)-i} do
      begin NODE_i(j) ← C_{i-1,2j} ← C_{i-1,2j+1} ← ∅
        if C_{i,j} ≠ ∅ then
          begin
            /* determine NODE_i(j) by extracting the
               appropriate edges from C_{i,j} */
            foreach k, 0 ≤ k < |C_{i,j}| do
              begin C_{i-1,2j}(k) ← C_{i-1,2j+1}(k) ← C_{i,j}(k)
                t(k) ← 0
                if C_{i,j}(k)[B] ≤ B_i(j) and
                   T_i(j) ≤ C_{i,j}(k)[T]
                  then t(k) ← 1
              end
            call EXTRACT1(C_{i,j},t)
            NODE_i(j) ← C_{i,j}

            /* determine C_{i-1,2j} and C_{i-1,2j+1} by extracting
               edges from the remaining of C_{i,j} */
            foreach k, 0 ≤ k < |C_{i-1,2j}| do
              begin
                if t = 0 and C_{i-1,2j}(k)[B] < T_{i-1}(2j)
                  then t_1 ← 1 else t_1 ← 0
                if t = 0 and C_{i-1,2j}(k)[T] > B_{i-1}(2j+1)
                  then t_2 ← 1 else t_2 ← 0
              end
            call EXTRACT1(C_{i-1,2j},t_1)
            call EXTRACT1(C_{i-1,2j+1},t_2)
          end
      end
end


To find all the intersections of a horizontal line segment H(k) with the
set V of vertical line segments, we use T as a two-dimensional binary search
tree: At a selected node NODE_i(j) of T, we report all the vertical segments
in the list of NODE_i(j) which are in the interval [H(k)[L],H(k)[R]]. Since
the vertical segments at NODE_i(j) are sorted by their x-values, the search
can be done in O(log n + k'), where k' is the number of intersections per
segment reported in one level. In the next step, we proceed to one son or
both of NODE_i(j) by comparing the y-value of H(k) with T_{i-1}(2j): if the
y-value of H(k) is less than, greater than or equal to T_{i-1}(2j) then
we proceed respectively to the left son, the right son or both sons.
At the selected son, we again report all the vertical line segments in
the list of this node which intersect with the horizontal line segment
H(k). We continue this process until we reach the bottom of T. Note
that the y-value of H(k) may be equal to only one T_{i-1}(2j). Thus, we
trace a unique path, possibly two, from the root to the bottom level; at
that stage all k'' intersections of segment H(k) are reported. Since T
is of height O(log N), this process runs in time O((log n)^2 + k''). We can
find intersections of all m horizontal lines with V simultaneously,
provided we search in one level of T for all horizontal lines before
going to the next level. The number of processors required is m for
parallel searching. Thus, we have the following result.

Theorem 3.1. All intersecting pairs of n vertical line segments and m
horizontal line segments can be reported in time O((log n)^2 + k) on a SMM
with max(4n,m) processors and max(4n,m) memory units, where k is the
maximum number of intersections of any horizontal line segment with the set
of vertical line segments.
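
The O(log n + k') search inside one node amounts to two binary searches on the node's x-sorted list. A minimal sequential sketch, assuming the node's list is given simply as a sorted list of x-values (the name `report_in_node` is hypothetical):

```python
from bisect import bisect_left, bisect_right


def report_in_node(xs, left, right):
    """xs: sorted x-values of the vertical segments stored at one node.

    Returns the x-values lying in the closed interval [left, right].
    """
    start = bisect_left(xs, left)    # first segment with x >= left
    stop = bisect_right(xs, right)   # one past the last segment with x <= right
    return xs[start:stop]


print(report_in_node([1, 3, 4, 8, 9], 3, 8))
```

The two binary searches account for the log n term and the slice for the k' reported segments.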

The  formal  description  of  the  intersection  algorithm  is  as  follows. 
procedure INTERSECT1(V,H):
/* find all intersecting pairs of horizontal line segments in H and
   vertical line segments in V */
begin
  /* construct the point location tree T for V */
  call CONSTRUCT_T1 (V)

  foreach k, 0 ≤ k < m do begin J_0(k) ← 0 ; J_1(k) ← -1 end

  /* search in T level by level */
  for i ← log(N+1) downto 0 do
    for p ← 0 to 1 do
      foreach k, 0 ≤ k < m do
        if J_p(k) ≥ 0 then
          begin search in NODE_i(J_p(k)) all vertical lines
                  in the range [H(k)[L],H(k)[R]]
            if y-value of H(k) = T_{i-1}(2J_p(k))
              then begin J_{p⊕1}(k) ← 2J_p(k) + 1     (1)
                         J_p(k) ← 2J_p(k)
                   end
              else if y-value of H(k) < T_{i-1}(2J_p(k))
                     then J_p(k) ← 2J_p(k)
                     else J_p(k) ← 2J_p(k) + 1
          end
end


3.1.2  Range  Searching 

We are given a set S of n points in the plane and a set Q of queries:
report all points of S in the range Q(i)[L] ≤ x ≤ Q(i)[R] and
Q(i)[B] ≤ y ≤ Q(i)[T]. We first organize the points in S so that we
can answer the queries efficiently.


(1) ⊕ is the exclusive-or operator.
We assume that Y(0:N-1) is a sorted array of the distinct y-values
of points in S, where N ≤ n. We also assume that N is a power of 2. We
construct a search tree X for the set of points. X is similar to T but
with the following differences. Associated with NODE_i(j) is a subset of
points with their y-values in the interval [B_i(j),T_i(j)], sorted by their
x-values, where B_i(j) = Y(j·2^i) and T_i(j) = Y((j+1)·2^i - 1). Figure 7 is an
example of search tree X. NODE_{logN}(0), the root, is the entire set S
sorted by x-values. We use procedure EXTRACT1 to partition each NODE_i(j),
starting with NODE_{logN}(0), into NODE_{i-1}(2j) and NODE_{i-1}(2j+1) such
that all points in NODE_{i-1}(2j) have y-values ≤ T_{i-1}(2j) and those in
NODE_{i-1}(2j+1) have y-values ≥ B_{i-1}(2j+1).
Again, like in the construction of T, X is constructed level by level.
Since the sum of |NODE_i(j)| over all nodes j of level i equals n for all i,
X can be constructed in time O((log n)^2) with n processors.

procedure CONSTRUCT_X(S):
/* determine, from S, NODE_i(j) of X */
begin
  sort S(0:n-1) by their x-values
  NODE_{logN}(0) ← S

  /* determine nodes of X level by level */
  for i ← logN downto 1 do
    foreach j, 0 ≤ j < 2^{logN-i} do
      begin
        /* partition points of NODE_i(j) into NODE_{i-1}(2j)
           and NODE_{i-1}(2j+1) according to their y-values */
        NODE_{i-1}(2j) ← NODE_{i-1}(2j+1) ← NODE_i(j)
        foreach a ∈ NODE_i(j) do
          if y-value of a ≤ T_{i-1}(2j)
            then t_1(a) ← 1 ; t_2(a) ← 0
            else t_2(a) ← 1 ; t_1(a) ← 0
        call EXTRACT1(NODE_{i-1}(2j), t_1)
        call EXTRACT1(NODE_{i-1}(2j+1), t_2)
      end
end
(a) A set of points and the corresponding Y array

(b) Search tree X for points in (a)

Figure 7. Search tree X for points in the plane.
Given a query Q(k), we search in X starting with the root until we
reach a NODE_i(j) such that Q(k)[B] ≤ B_i(j) ≤ T_i(j) ≤ Q(k)[T]. Then we
report all points in NODE_i(j) with x-values in the interval [Q(k)[L],Q(k)[R]].
Since points in NODE_i(j) are ordered by their x-values, the query is
answered in O((log n)^2 + k') time with 1 processor where k' is the number
of inclusions. All m queries can be treated in parallel if we search in
one level of X for all queries at a time. Therefore we have the following
result for range searching:

Theorem 3.2. The two-dimensional range searching problem for n data
and m queries can be solved in time O((log n)^2 + k) on a SMM with
max(n,m) processors and memory units, where k is the maximum number of
inclusions per query.
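
The structure of X and the way a query descends through it can be illustrated sequentially. The sketch below is not the SMM program: it assumes that all y-values are distinct (so N = n) and that n is a power of 2, keeps the nodes in a dictionary keyed by (i, j), and uses hypothetical names `build_x` and `range_query`.

```python
from bisect import bisect_left, bisect_right


def build_x(points):
    """points: list of (x, y) with distinct y-values; len(points) a power of 2."""
    ys = sorted(y for _, y in points)
    rank = {y: r for r, y in enumerate(ys)}
    height = len(ys).bit_length() - 1
    nodes = {(height, 0): sorted(points)}           # root: all of S sorted by x
    for i in range(height, 0, -1):
        for j in range(1 << (height - i)):
            split = (2 * j + 1) << (i - 1)          # rank of B_{i-1}(2j+1)
            parent = nodes[(i, j)]
            # Order-preserving filtering keeps each son sorted by x.
            nodes[(i - 1, 2 * j)] = [p for p in parent if rank[p[1]] < split]
            nodes[(i - 1, 2 * j + 1)] = [p for p in parent if rank[p[1]] >= split]
    return nodes, ys, height


def range_query(tree, left, right, bottom, top):
    nodes, ys, height = tree
    out = []

    def visit(i, j):
        lo, hi = ys[j << i], ys[((j + 1) << i) - 1]   # B_i(j), T_i(j)
        if top < lo or hi < bottom:
            return                                    # node interval misses the query
        if bottom <= lo and hi <= top:
            pts = nodes[(i, j)]
            xs = [p[0] for p in pts]
            out.extend(pts[bisect_left(xs, left):bisect_right(xs, right)])
        else:
            visit(i - 1, 2 * j)
            visit(i - 1, 2 * j + 1)

    visit(height, 0)
    return sorted(out)
```

Every level of the dictionary holds each point exactly once, which is the property used above to bound the construction by n processors.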


procedure RANGE_SEARCH1(S,Q)
/* report all points a ∈ S such that Q(i)[L] ≤ x(a) ≤ Q(i)[R] and
   Q(i)[B] ≤ y(a) ≤ Q(i)[T], for every Q(i) ∈ Q */
begin
  /* construct the search tree X for the set S of points */
  call CONSTRUCT_X(S)

  foreach k, 0 ≤ k < m do J_{logN}(k) ← {0}

  /* search in X, beginning at the root */
  for i ← logN downto 0 do
    foreach k, 0 ≤ k < m do
      begin J_{i-1}(k) ← ∅
        foreach j ∈ J_i(k) do
          begin if Q(k)[B] ≤ B_i(j) and T_i(j) ≤ Q(k)[T]
            then search in NODE_i(j) and report any
                 pair (Q(k),a) where
                 Q(k)[L] ≤ x(a) ≤ Q(k)[R], a ∈ NODE_i(j)
            else begin if Q(k)[B] ≤ T_{i-1}(2j)
                         then J_{i-1}(k) ← J_{i-1}(k) ∪ {2j}
                       if Q(k)[T] ≥ B_{i-1}(2j+1)
                         then J_{i-1}(k) ← J_{i-1}(k) ∪ {2j+1}
                 end
          end
      end
end


3.1.3  The  Rectangle  Intersection  Algorithm 

In previous subsections of this section we have investigated the
rectangle intersection problem in a top-down fashion. Procedure RECTINT1(REC)
is the complete description of the entire algorithm for reporting all pairs of
intersections of rectangles REC. Another two programs (RECTINT2 and RECTINT3)
will be written for the CCC.

procedure RECTINT1(REC):
begin
  V ← all vertical edges of rectangles in REC
  H ← all horizontal edges of rectangles in REC
  call INTERSECT1(V,H)

  S ← all left bottom points of rectangles in REC
  Q ← REC
  call RANGE_SEARCH1(S,Q)
end


Combining the results in previous subsections, we can show that RECTINT1
runs in time O((log N)^2 + k) on a SMM with 8N processors and memories, where
N is the number of rectangles and k is the maximum number of intersections
per rectangle. However, a simple-minded processor-time tradeoff can
reduce the number of processors to N by increasing the time by a factor of 8
as follows. We can partition the set of vertical edges into eight subsets,
each of which has N/8 edges. We then find the intersections of the set of
horizontal edges with each of these eight subsets of vertical edges
sequentially. We conclude this section by the following theorem.

Theorem 3.3. Given N rectangles with edges parallel to the coordinate
axes, all intersecting pairs of these rectangles can be reported in time
O((log N)^2 + k) on a SMM with N processors and N memories, where k is the
maximum number of intersections per rectangle.

3.2  On  the  CCC  with  N  Processors 

In this section we shall present an algorithm which solves the
rectangle intersection problem in time O((log N)^2 + k) on a CCC with N
processors, where k is the maximum number of intersections per rectangle.
We shall first discuss three intermediate problems: one-dimensional range
searching, intersection of horizontal and vertical line segments, and
two-dimensional range searching.

3.2.1  One-Dimensional  Range  Searching 

Given a set A(0:N-1) sorted in ascending order and a set Q(0:M-1)
of queries specified by two bounds [L] and [R] (left and right respectively),
we want to report all elements of A which lie in the range [Q(i)[L],
Q(i)[R]], 0 ≤ i < M. We approach this problem by first finding A(j_i) such
that A(j_i-1) < Q(i)[L] ≤ A(j_i), for each i, and then reporting sequentially
the pairs (Q(i),A(j_i)), (Q(i),A(j_i+1)),...,(Q(i),A(j'_i)) where
A(j'_i) ≤ Q(i)[R] < A(j'_i+1); we assume that Q is sorted by the values of the
left bounds in ascending order. We then merge A and Q. We perform a
parallel search, similar to the one introduced in Section 2.2.3, for
determining A(j_i) for all Q(i). Before reporting any inclusions, we
eliminate those queries which do not have any inclusion (i.e., if
Q(i)[R] < A(j_i)) from further consideration. We report sequentially all
the inclusions for every query.
For example, consider the case where N = 7, M = 4 and the sequences
of A and Q are as shown in Figure 8(a). Figure 8(c) is the merged
sequence with Q(1) eliminated as Q(1)[R] = 4 < A(3) = 5, i.e., none of
the A's lies in the range [Q(1)[L],Q(1)[R]]. We then start to report all
inclusions by looking to the right simultaneously for every query:
(Q(0),A(2)), (Q(2),A(3)) and (Q(3),A(5)) are reported first; next
(Q(0),A(3)) and (Q(2),A(4)) are reported at the same time; then
(Q(0),A(4)), (Q(0),A(5)) are reported one at a time.


procedure RANGE_SEARCH_1D(A,Q)
/* A(0:N-1) is a sorted array, Q(0:M-1) is a set of queries sorted
   by values of Q(i)[L]. Report all elements of A which lie in
   [Q(i)[L],Q(i)[R]] for i = 0,...,M-1 */
begin
  /* copy information of A and Q into D */
  foreach i, 0 ≤ i < M do begin D(i)[type] ← query
                               D(i)[key] ← Q(i)[L]
                               D(i)[R] ← Q(i)[R]
                               D(i)[value] ← Q(i)
                         end
  foreach i, 0 ≤ i < N do begin D(M+i)[type] ← data
                               D(M+i)[key] ← A(i)
                               D(M+i)[value] ← A(i)
                         end
  apply bitonic merge to D
  determine P such that P(i) = A(j_i) and A(j_i-1) < Q(i)[L] ≤ A(j_i)

  /* eliminate those queries which do not have inclusions */
  foreach i, 0 ≤ i < N+M do
    if D(i)[type] = query and D(i)[R] < P(i)
      then t(i) ← 0
      else t(i) ← 1
  call EXTRACT2 (D,t)

  /* report inclusions */
  foreach i, 0 ≤ i < N+M do
    begin T(i) ← null
      if D(i)[type] = query then T(i) ← D(i)[value]
    end
  while ∃ i, T(i) ≠ null do
    begin for j ← 1 to ℓ do /* ℓ = loop length of CCC */
      foreach i, 0 ≤ i < N+M do
        begin
          if i mod ℓ = j-1 and D(i)[type] = data
            then if T(i)[R] ≥ D(i)[value] then
                   report (T(i),D(i)[value])
                 else T(i) ← null
          T(i+1 mod ℓ) ← T(i)
        end
    end
end

Figure 8. One-dimensional range searching.


All steps except the last while loop clearly require at most
O(log(N+M)) steps. The evaluation of the condition of the while loop and
each step in this loop require O(log(N+M)) time. But these are only performed
at most k/ℓ times, where k is the maximum number of inclusions per query
and ℓ is the loop length of the CCC, which is of order log(N+M). Therefore,
the time complexity of the while loop is O(k). Hence, procedure RANGE_SEARCH_1D
runs in time O(log(N+M) + k) on a CCC with N+M processors.

Theorem 3.4. Given a sorted array A(0:N-1) and a set Q(0:M-1) of queries
sorted by values of the left bounds, all elements of A which lie in the
range [Q(i)[L],Q(i)[R]], for i = 0,...,M-1, can be found in time
O(log(N+M) + k) on a CCC with N+M processors, where k is the maximum number
of inclusions per query.
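
The three phases of the one-dimensional search (locate A(j_i), eliminate empty queries, report in rounds) can be mimicked sequentially. In the sketch below a binary search replaces the merge and broadcast, each pass of the while loop corresponds to one round in which every surviving query reports one inclusion, and the data are hypothetical values chosen to reproduce the reporting order of the example in the text (the actual values of Figure 8 are not reproduced here).

```python
from bisect import bisect_left


def range_search_1d(a, queries):
    """a: ascending list; queries: list of (left, right) bounds.

    Returns a list of rounds; each round lists (query index, reported element).
    """
    # Phase 1: for each query find j_i, the first index with a[j_i] >= left.
    cursor = {i: bisect_left(a, lo) for i, (lo, _) in enumerate(queries)}
    # Phase 2: eliminate queries that have no inclusion at all.
    active = {i: j for i, j in cursor.items()
              if j < len(a) and a[j] <= queries[i][1]}
    rounds = []
    # Phase 3: every surviving query reports one inclusion per round.
    while active:
        rounds.append([(i, a[j]) for i, j in sorted(active.items())])
        active = {i: j + 1 for i, j in active.items()
                  if j + 1 < len(a) and a[j + 1] <= queries[i][1]}
    return rounds


A = [1, 2, 3, 5, 6, 7, 9]                 # hypothetical data with A(3) = 5
Q = [(3, 7), (4, 4), (5, 6), (7, 8)]      # hypothetical queries, Q(1)[R] = 4
for r in range_search_1d(A, Q):
    print(r)
```

The number of rounds equals the largest number of inclusions of any single query, which is the k of the theorem.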

3.2.2  Intersection  of  Horizontal  and  Vertical  Line  Segments 

We revisit the problem of reporting intersecting pairs of horizontal
and vertical line segments as introduced in Section 3.1.1. We shall
revise procedure INTERSECT1 so that it will be suitable for implementation
on a CCC with a linear number of processors. Most of the variables used
here will have the same meanings as those in Section 3.1.1.
For the set V(0:n-1) of vertical line segments, we construct a
search structure E which consists of logN+1 arrays E_0,E_1,...,E_{logN},
where N is the number of distinct y-values of the endpoints of the
segments in V. Each E_i is a selected subset of vertical line segments in V.
The underlying structure of E is a binary tree similar to T except for
the indexing of the nodes. Instead of indexing the nodes, in some level i,
from left to right, a node will be indexed as j if it is the left son of
NODE_{i+1}(j) in level i+1 for some j and it will be indexed as 2^{logN-i-1}+j if
it is the right son of NODE_{i+1}(j). Therefore, the left and the right sons
of NODE_{i+1}(j) are NODE_i(j) and NODE_i(2^{logN-i-1}+j) respectively. Suppose
NODE_{i+1}(j) is the k-th leftmost node in level i+1, then NODE_i(j) represents
the interval [B_i(j),T_i(j)] = [Y(2k·2^i),Y((2k+1)·2^i)], and NODE_i(2^{logN-i-1}+j)
represents the interval [B_i(2^{logN-i-1}+j),T_i(2^{logN-i-1}+j)] =
[Y((2k+1)·2^i),Y((2k+2)·2^i)]. The left-to-right sequence of the node indices
at any level of E is the bit-reversal permutation of the node indices at the
corresponding level of T, where the bit-reversal permutation maps a binary
number a_{n-1}a_{n-2}...a_0 into the binary number a_0a_1...a_{n-1}. Figure 9(a) is the
underlying binary tree of E for the vertical line segments in Figure 6(a).
Note that Figure 9(a) is the same as T in Figure 6(b) except for the node
indices. The array E_i of E is the concatenation of the lists of vertical line
segments associated with the nodes in level i in the order of increasing node
indices. We also associate with each element E_i(j) the node number N#_i(j) such
that E_i(j)[B] ≤ B_i(N#_i(j)) and E_i(j)[T] ≥ T_i(N#_i(j)) and E_i(j) does not belong to
any ancestor of NODE_i(N#_i(j)). Therefore, E_i is a selected list of vertical
line segments sorted lexicographically by values of N#_i and their position in
the positive x direction. Figure 9(b) shows the arrays E_3,E_2,E_1,E_0 for the
vertical line segments in Figure 6(a) (null elements are denoted by X).


(a) the underlying binary tree.

(b) the collection of arrays E_3,...,E_0

Figure 9. Search structure E for vertical line segments in Figure 6(a).
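
The claim that this indexing yields the bit-reversal permutation can be checked mechanically. The sketch below generates the left-to-right node indices level by level, taking the left son of node j to keep index j and the right son to receive 2^{logN-i-1}+j; the function names are hypothetical.

```python
def level_indices(log_n):
    """Left-to-right node indices per level, from the root (level log_n) to level 0."""
    levels = {log_n: [0]}
    for i in range(log_n - 1, -1, -1):
        offset = 1 << (log_n - i - 1)
        row = []
        for j in levels[i + 1]:
            row += [j, offset + j]     # left son keeps j, right son gets offset + j
        levels[i] = row
    return levels


def bit_reverse(value, bits):
    """Reverse the lowest `bits` bits of value."""
    out = 0
    for _ in range(bits):
        out = (out << 1) | (value & 1)
        value >>= 1
    return out


levels = level_indices(3)
print(levels[0])
print([bit_reverse(k, 3) for k in range(8)])
```

Both printed sequences coincide, and the same holds at every level when the reversal is taken over logN - i bits.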
Construction of E is similar to that of T; the arrays are constructed
one at a time:

procedure CONSTRUCT_E1(V)
/* construct the search structure E, i.e. E_{logN},...,E_0, for set
   V(0:n-1) of vertical line segments */
begin sort V by x-values and then y-values of the bottom endpoints.
  foreach j, 0 ≤ j < n do begin S(j) ← V(j)
                               π(j) ← 0 end
  foreach j, n ≤ j < 4n do S(j) ← null

  /* determine E_{logN},...,E_0 one at a time */
  for i ← logN downto 0 do
    begin
      /* determine E_i by extracting edges from S */
      foreach j, 0 ≤ j < 4n do
        begin t_1(j) ← t_2(j) ← 0
          E_i(j) ← S(j); N#_i(j) ← π(j)
          if S(j) ≠ null
            then if S(j)[B] ≤ B_i(π(j)) and
                    T_i(π(j)) ≤ S(j)[T]
                   then t_1(j) ← 1
                   else t_2(j) ← 1
        end
      call EXTRACT2(E_i,t_1); call EXTRACT2(N#_i,t_1)
      call EXTRACT2(S,t_2); call EXTRACT2(π,t_2)

      /* rearrange the order of elements in S according to
         their node numbers in the next level */
      foreach j, 0 ≤ j < 4n do
        begin TEMP(j) ← S(j)
          t_1(j) ← t_2(j) ← 0
          if S(j)[B] < T_{i-1}(π(j)) then t_1(j) ← 1
          if S(j)[T] > T_{i-1}(π(j)) then begin
               t_2(j) ← 1
               TEMPπ(j) ← 2^{logN-i} + π(j)
             end
        end
      call EXTRACT2(S,t_1); call EXTRACT2(π,t_1)
      call EXTRACT2(TEMP,t_2); call EXTRACT2(TEMPπ,t_2)
      foreach j, 0 ≤ j < |TEMP| do begin S(j+|S|) ← TEMP(j)
                                        π(j+|S|) ← TEMPπ(j)
                                  end
    end
end

Analysis of procedure CONSTRUCT_E is similar to that of CONSTRUCT_T.
It is easy to show that CONSTRUCT_E can be implemented on a CCC with 4n
processors in O((log n)^2) steps.

To find intersecting pairs, we use E as a binary tree. We associate
with each horizontal line segment H(i) a node number NN(i) indicating
that H(i) may intersect some vertical line in node NN(i). We start at
E_{logN} (the root); it is obvious that NN(i) = 0 for all i (there is only
node 0 at this level). The set of horizontal lines is maintained sorted
lexicographically by their node numbers and x-values of their left
endpoints. Since E_i is sorted in the same manner, we can use the
one-dimensional range searching algorithm in Section 3.2.1 to report all
intersecting pairs at level i. We then determine which node in the next
level should be associated with each horizontal line segment. We
continue this process which geometrically traces a unique path, possibly
two, from the root to a leaf. Since the depth of E is logN+1, this process
requires O(log n · log(n+m) + k) time on a CCC with 4n+2m processors. We
now present formally the intersection algorithm.


procedure INTERSECT2(V,H):
/* search all intersecting pairs of horizontal line segments in H
   and vertical line segments in V */
begin
  /* construct the search structures E_{logN},...,E_0 for V */
  call CONSTRUCT_E1(V)

  /* H', the set of horizontal line segments, is maintained sorted
     lexicographically by their node number and x-values of their
     left endpoints */
  sort H by x-values of left endpoints
  foreach j, 0 ≤ j < m do begin H'(j) ← H(j)
                               NN(j) ← 0; end
  foreach j, m ≤ j < 2m do H'(j) ← null

  /* search in E beginning at E_{logN} */
  for i ← logN downto 0 do
    begin call RANGE_SEARCH_1D(E_i,H')

      /* determine node numbers for horizontal line segments
         to be used in the next level; then H' is reordered
         according to their node numbers */
      foreach j, 0 ≤ j < 2m do
        begin t_1(j) ← t_2(j) ← 0
          TEMP(j) ← H'(j)
          if H'(j) ≠ null then
            begin if y-value of H'(j) ≤ T_{i-1}(NN(j))
                    then t_1(j) ← 1
                  if y-value of H'(j) ≥ T_{i-1}(NN(j))
                    then begin
                           t_2(j) ← 1
                           TEMPNN(j) ← 2^{logN-i} + NN(j)
                         end
            end
        end
      call EXTRACT2(H',t_1)
      call EXTRACT2(NN,t_1)
      call EXTRACT2(TEMP,t_2)
      call EXTRACT2(TEMPNN,t_2)
      foreach j, 0 ≤ j < |TEMP| do
        begin H'(|H'|+j) ← TEMP(j)
              NN(|H'|+j) ← TEMPNN(j)
        end
    end
end

Procedure  INTERSECT2  gives  the  following  theorem. 

Theorem 3.5. All intersecting pairs of n vertical line segments and m
horizontal line segments can be reported in time O((log(n+m))^2 + k) on a
CCC with 4n+2m processors, where k is the maximum number of
intersections per vertical line segment.
3.2.3 Two-Dimensional Range Searching

We now investigate the two-dimensional range searching problem stated
in Section 3.1.2 on a CCC with a linear number of processors. Again, we
assume that Y(0:N-1) is a sorted array of distinct y-values of points of
S, where N ≤ n and N is a power of 2. We construct a search structure F
which consists of logN+1 arrays F_{logN},F_{logN-1},...,F_0. The underlying
structure of F is a binary tree similar to X (Section 3.1.2) except for the
indexing of the nodes. The nodes in the underlying binary tree of F are
indexed in the same manner as that of E (Section 3.2.2). Figure 10(a)
shows the underlying binary tree of F for the set of points in Figure 7(a).
Note that Figure 10(a) is the same as X in Figure 7(b) except for the node
indices. Suppose that NODE_{i+1}(j) is the k-th leftmost node in level i+1,
then its left son NODE_i(j) represents the interval [B_i(j),T_i(j)] =
[Y(2k·2^i),Y((2k+1)·2^i - 1)] and its right son NODE_i(2^{logN-i-1}+j) represents
the interval [B_i(2^{logN-i-1}+j),T_i(2^{logN-i-1}+j)] = [Y((2k+1)·2^i),
Y((2k+2)·2^i - 1)]. Therefore, F_i is the set S of points sorted
lexicographically by their node numbers and x-values. At level i, the node
number of F_i(k) is NN_i(k), where the y-value of F_i(k) is in the range
[B_i(NN_i(k)),T_i(NN_i(k))]. Figure 10(b) shows the contents of F_i and NN_i
for the example in Figure 7(a). The construction of F is similar to that of E:
The set S of points is first sorted by their x-values. The resulting array
is F_{logN}. We then determine the node numbers for each point and
rearrange the order of points in the array according to their node numbers.
Since the cardinality of F_i is n for all i, F can be constructed in time
O((log n)^2) on a CCC with n processors. The program CONSTRUCT_F for
constructing F is presented in the Appendix.


(a) the underlying binary tree.

(b) the collection of arrays F_{logN},...,F_0.

Figure 10. Search structure F for the set of points in Figure 7(a).
To answer the set Q of queries, we search in F for each k until we
reach level i such that Q(k)[B] ≤ B_i(j) and T_i(j) ≤ Q(k)[T] for some j.
Then we perform a one-dimensional range search to report all the
inclusions. Since we may visit at most four nodes on one level for a
particular query, 4m+n processors are sufficient. We use the result in
Section 3.2.1 for one-dimensional range searching, so we have the following
result.

Theorem 3.6. The two-dimensional range searching problem for n data and m
queries can be solved in time O((log(n+m))^2 + k) on a CCC with n+4m
processors, where k is the maximum number of inclusions per query.
procedure RANGE_SEARCH2(S,Q)
/* report all points a ∈ S such that Q(i)[L] ≤ x(a) ≤ Q(i)[R]
   and Q(i)[B] ≤ y(a) ≤ Q(i)[T] for every Q(i) */
begin
   /* construct the search arrays F_logN, ..., F_0 for the set S */
   call CONSTRUCT_F(S)

   /* Q' is the set Q sorted by the values of left bounds */
   Q' ← Q
   sort Q' by Q'(i)[L]
   foreach j, 0 ≤ j < m do NN(j) ← 0
   foreach j, m ≤ j < 4m do Q'(j) ← null

   /* search in F_logN, ..., F_0 one at a time */
   for i ← logN downto 0 do
   begin
      /* determine Q" which is a subset of queries that can be
         answered at this level. For the remaining queries,
         determine their node numbers in the next level */
      foreach j, 0 ≤ j < 4m do
      begin t1(j) ← t2(j) ← t3(j) ← 0
         Q"(j) ← TEMP(j) ← Q'(j)
         NN"(j) ← NN(j); TEMPNN(j) ← NN(j) + 2^(logN-i)
         if Q'(j)[B] ≤ B_i(NN(j)) and T_i(NN(j)) ≤ Q'(j)[T]
         then t1(j) ← 1
         else begin
            if Q'(j)[B] ≤ T_(i-1)(NN(j)) then t2(j) ← 1
            if Q'(j)[T] ≥ B_(i-1)(NN(j) + 2^(logN-i)) then t3(j) ← 1
         end
      end
      call EXTRACT2(Q",t1); call EXTRACT2(NN",t1)

      /* answer queries in Q" by performing a one-dimensional
         range searching */
      call RANGE_SEARCH_1D(F_i,Q")

      /* extract Q'-Q" from Q' and rearrange the order according
         to their node numbers */
      call EXTRACT2(Q',t2); call EXTRACT2(NN,t2)
      call EXTRACT2(TEMP,t3); call EXTRACT2(TEMPNN,t3)
      foreach j, 0 ≤ j < |TEMP| do
      begin Q'(j + |Q'|) ← TEMP(j)
         NN(j + |Q'|) ← TEMPNN(j)
      end
   end
end
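
The level-by-level search can be mirrored by a small sequential sketch. This is an illustration only, not the CCC procedure: it stores one x-sorted list per tree node, numbers the nodes of a level from left to right (not with the node numbers used in the text), and answers one query at a time. The names `build_levels` and `range_query` are hypothetical.

```python
from bisect import bisect_left, bisect_right

def build_levels(points):
    """Return (ys, levels); levels[i][k] holds the points of node k at level i, sorted by x."""
    ys = sorted({y for _, y in points})
    size = 1
    while size < len(ys):
        size *= 2                      # pad the number of y-ranks to a power of 2
    rank = {y: r for r, y in enumerate(ys)}
    height = size.bit_length() - 1
    by_x = sorted(points)              # sort once by x (ties broken by y)
    levels = []
    for i in range(height + 1):
        nodes = {}
        for x, y in by_x:              # node k at level i covers ranks k*2^i .. (k+1)*2^i - 1
            nodes.setdefault(rank[y] >> i, []).append((x, y))
        levels.append(nodes)
    return ys, levels

def range_query(ys, levels, left, right, bottom, top):
    """Report the points with left <= x <= right and bottom <= y <= top."""
    lo = bisect_left(ys, bottom)       # smallest y-rank inside the query
    hi = bisect_right(ys, top) - 1     # largest y-rank inside the query
    found = []

    def visit(i, k):
        node_lo, node_hi = k << i, ((k + 1) << i) - 1
        if node_hi < lo or node_lo > hi:
            return                     # node interval misses the query
        if lo <= node_lo and node_hi <= hi:
            pts = levels[i].get(k, []) # query spans the node: 1-D search on x
            a = bisect_left(pts, (left, float("-inf")))
            b = bisect_right(pts, (right, float("inf")))
            found.extend(pts[a:b])
            return
        visit(i - 1, 2 * k)            # otherwise descend to both sons
        visit(i - 1, 2 * k + 1)

    visit(len(levels) - 1, 0)
    return sorted(found)
```

As in the text, a query stops descending as soon as its y-range spans a node interval, and the remaining work is a one-dimensional search on x within that node.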


3.2.4  The  Rectangle  Intersection  Algorithm 

The  rectangle  intersection  algorithm  for  a  CCC  is  the  same  as  that 
for  a  SMM  but  uses  different  algorithms  for  finding  the  intersections  of 
horizontal  and  vertical  line  segments  and  for  two-dimensional  range 
searching. 

procedure RECTINT2(REC):
begin
   V ← all vertical edges of rectangles in REC
   H ← all horizontal edges of rectangles in REC
   call INTERSECT2(V,H)
   S ← all left bottom endpoints of rectangles in REC
   Q ← REC
   call RANGE_SEARCH2(S,Q)
end

Theorem 3.7. Given N rectangles with edges parallel to the coordinate axes, all intersecting pairs of these rectangles can be reported in time $O((\log N)^2 + k)$ on a CCC with N processors, where k is the maximum number of intersections per rectangle.

Proof: Combining results in Sections 3.2.2 and 3.2.3, we use some simple processor-time tradeoffs similar to the one used in the previous section to achieve the time complexity of $O((\log N)^2 + k)$ and processor complexity of N. □
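
The reduction used by RECTINT2 can be checked with a brute-force sequential sketch: two closed rectangles intersect exactly when a vertical edge of one meets a horizontal edge of the other, or the left bottom corner of one lies inside the other. The code below is an illustration under that reading; the rectangle encoding `(x1, y1, x2, y2)` and all function names are assumptions, and it compares all pairs instead of using the search structures of the text.

```python
from itertools import combinations

def edges(rect):
    """Split a rectangle (x1, y1, x2, y2) into vertical and horizontal edges."""
    x1, y1, x2, y2 = rect
    vertical = [(x1, y1, y2), (x2, y1, y2)]      # (x, y_low, y_high)
    horizontal = [(y1, x1, x2), (y2, x1, x2)]    # (y, x_low, x_high)
    return vertical, horizontal

def crosses(v, h):
    """True if vertical segment v and horizontal segment h share a point."""
    x, y_lo, y_hi = v
    y, x_lo, x_hi = h
    return x_lo <= x <= x_hi and y_lo <= y <= y_hi

def contains(rect, point):
    """Range-search test: is the point inside the closed rectangle?"""
    x1, y1, x2, y2 = rect
    return x1 <= point[0] <= x2 and y1 <= point[1] <= y2

def intersecting_pairs(rects):
    pairs = set()
    for (a, ra), (b, rb) in combinations(enumerate(rects), 2):
        va, ha = edges(ra)
        vb, hb = edges(rb)
        # part 1: intersections of vertical and horizontal edges
        edge_hit = (any(crosses(v, h) for v in va for h in hb)
                    or any(crosses(v, h) for v in vb for h in ha))
        # part 2: range search with left bottom corners as data points
        corner_hit = contains(ra, rb[:2]) or contains(rb, ra[:2])
        if edge_hit or corner_hit:
            pairs.add((a, b))
    return pairs
```

The second test is what catches a rectangle lying entirely inside another, where no edges cross.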

3.3 On the CCC with $N^{1+\alpha}$ Processors

In this section we shall develop an algorithm for reporting intersecting pairs of N rectangles for a CCC with a superlinear number of processors. This algorithm can be implemented in $O(\frac{1}{\alpha}\log N + k)$ time requiring $N^{1+\alpha}$ processors, where $0 < \alpha \le 1$ and k is the maximum number of intersections per rectangle.

3.3.1  Intersection  of  Horizontal  and  Vertical  Line  Segments 
As in the algorithms developed for a CCC with N processors, we construct a search structure $\mathcal{D}$ for the set V(0: n-1) of vertical line segments so that the intersections of horizontal line segments in H(0: m-1) and V(0: n-1) can be found efficiently. Let N be the number of distinct y-values of the endpoints of V. $\mathcal{D}$ consists of $\frac{1}{\alpha}+1$ arrays $D_{1/\alpha}, D_{1/\alpha-1}, \ldots, D_0$. Each $D_i$ is a selected subset of V sorted lexicographically by their node number (as defined in Section 3.2.2) and their positions in the positive x direction. The underlying geometric structure of $\mathcal{D}$ is an $N^\alpha$-ary tree of height $\frac{1}{\alpha}$; there are $N^{1-i\alpha}$ nodes at height i, indexed as follows. At level $\frac{1}{\alpha}$ the root is indexed 0. Node j, which is the k-th leftmost node at level i, has $N^\alpha$ sons at level i-1; they are nodes $j,\ N^{1-i\alpha}+j,\ 2N^{1-i\alpha}+j,\ \ldots,\ (N^\alpha-1)N^{1-i\alpha}+j$, representing respectively the intervals

$[Y(kN^\alpha N^{i\alpha-\alpha}),\ Y((kN^\alpha+1)N^{i\alpha-\alpha})]$, $[Y((kN^\alpha+1)N^{i\alpha-\alpha}),\ Y((kN^\alpha+2)N^{i\alpha-\alpha})]$, $[Y((kN^\alpha+2)N^{i\alpha-\alpha}),\ Y((kN^\alpha+3)N^{i\alpha-\alpha})]$, $\ldots$, $[Y((kN^\alpha+N^\alpha-1)N^{i\alpha-\alpha}),\ Y((kN^\alpha+N^\alpha)N^{i\alpha-\alpha})]$.

Figure 11 shows an example with N = 16, $\alpha = \frac{1}{2}$. Figure 11(b) is the underlying $N^\alpha$-ary tree; pairs of integers in the circles are values of $B_i(j)$ and $T_i(j)$, and the integers above the circles are node numbers.
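
The indexing rule for the sons of a node can be made concrete with a short sketch. It assumes that $N^\alpha$ is an integer (called `arity`) and that the tree height $1/\alpha$ is an integer (called `height`), so that $N^{1-i\alpha}$ equals `arity ** (height - i)` and $N^{i\alpha-\alpha}$ equals `arity ** (i - 1)`. The function name `sons` and its return format are hypothetical.

```python
def sons(i, j, k, arity, height):
    """Sons (at level i-1) of node j, the k-th leftmost node at level i >= 1.

    Returns triples (node_number, low, high), meaning the son covers
    the interval [Y(low), Y(high)].
    """
    stride = arity ** (height - i)    # N^(1 - i*alpha): gap between son numbers
    width = arity ** (i - 1)          # N^(i*alpha - alpha): y-values per son
    return [(j + c * stride,
             (k * arity + c) * width,
             (k * arity + c + 1) * width)
            for c in range(arity)]
```

With N = 16 and arity 4 (height 2), the root has sons 0, 1, 2, 3, and node 1 at level 1 has sons 1, 5, 9, 13, so the node numbers of one level interleave across subtrees.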

The construction of arrays $D_{1/\alpha}, \ldots, D_0$ runs as follows. Initially, the node number of each vertical line segment is 0. Let S be the set V of vertical line segments sorted lexicographically by their node numbers, x-values, and y-values of bottom endpoints. We extract from S all the segments which cover the range [Y(0), Y(N)] and form the set $D_{1/\alpha}$. After the extraction, the remaining elements of S are duplicated $N^\alpha - 1$ times. Then we determine to which of the $N^\alpha$ subtrees we should branch for each vertical line segment, that is, we determine the node numbers for the remaining elements of S in the next level as follows. We branch to the leftmost subtree if the y-value of one or both endpoints of the vertical line segment is in the range $[B_{1/\alpha-1}(0), T_{1/\alpha-1}(0)]$, branch to the second leftmost subtree if it is in the range $[B_{1/\alpha-1}(1), T_{1/\alpha-1}(1)]$, and so on. We then repeat the process until all arrays of $\mathcal{D}$ are determined.

Let us analyze the time and number of processors required. At each iteration i, a vertical line segment may appear at most $2N^\alpha$ times in S. After the extraction of $D_i$, S contains at most 2n elements. Then the elements of S are replicated into $N^\alpha$ copies. Therefore, at any time, the maximum number of elements in S is $2nN^\alpha < 4n^{1+\alpha}$. Since data extraction and replication can be done in time O(log n) on a CCC with a number of processors linear in the problem size and $\mathcal{D}$ contains $\frac{1}{\alpha}+1$ arrays, $\mathcal{D}$ can be determined in time $O(\frac{1}{\alpha}\log n)$ with $4n^{1+\alpha}$ processors. We now present formally the construction algorithm which we just described.

procedure CONSTRUCT_D1(V)
/* construct the arrays D_(1/α), D_(1/α-1), ..., D_0 for the set V of
   vertical line segments */
begin
   /* maintain S as an array of vertical line segments sorted
      lexicographically by their node numbers and their
      x-coordinates */
   sort V by x-values and then y-values of bottom endpoints
   foreach j, 0 ≤ j < n do begin S(j) ← V(j); π(j) ← 0 end
   foreach j, n ≤ j < 2nN^α do S(j) ← null

   /* D_(1/α), ..., D_0 are constructed one by one in descending order */
   for i ← 1/α downto 0 do
   begin
      /* for each vertical line segment of S, determine if
         it belongs to some node at this level; extract
         those which do and assign them to D_i */
      foreach j, 0 ≤ j < 2nN^α do
      begin t1(j) ← t2(j) ← 0; D_i(j) ← S(j); NN_i(j) ← π(j)
         if S(j) ≠ null
         then if S(j)[B] ≤ B_i(π(j)) and T_i(π(j)) ≤ S(j)[T]
              then t1(j) ← 1
              else t2(j) ← 1
      end
      call EXTRACT2(D_i,t1); call EXTRACT2(NN_i,t1)

      /* for the remaining of S, determine their node numbers
         for the next level; and reorder them according to
         their node numbers */
      call EXTRACT2(S,t2); call EXTRACT2(π,t2)
      for k ← log 2n to log 2nN^α - 1 do   /* duplicate N^α times */
         foreach j, 0 ≤ j < 2nN^α do
            if BIT_k(j) = 0 then begin S(j + 2^k) ← S(j)
                                       π(j + 2^k) ← π(j)
                                 end
      foreach j, 0 ≤ j < 2nN^α do   /* determine node numbers */
      begin π(j) ← π(j) + ⌊j/2n⌋ N^(1-iα)
         t(j) ← 0
         if S(j) ≠ null and (S(j)[B] < T_(i-1)(π(j)) or
                             S(j)[T] > B_(i-1)(π(j)))
         then t(j) ← 1
      end
      call EXTRACT2(S,t); call EXTRACT2(π,t)   /* reordering */
   end
end
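
The duplication loop in the procedure (copy element j to position j + 2^k whenever bit k of j is 0) doubles the filled prefix of the array at every step. The following sequential simulation is a sketch of that idea only; it assumes the block length and the number of copies are powers of two, and the name `replicate` is hypothetical.

```python
def replicate(block, copies):
    """Fill an array with `copies` copies of `block` by repeated doubling."""
    base = len(block)
    total = base * copies
    out = list(block) + [None] * (total - base)
    k = base.bit_length() - 1          # log2 of the block length
    while (1 << k) < total:
        step = 1 << k
        for j in range(total):         # on the CCC every j is handled in parallel
            if (j >> k) & 1 == 0 and j + step < total:
                out[j + step] = out[j] # BIT_k(j) = 0: copy across dimension k
        k += 1
    return out
```

Each pass writes only to positions whose bit k is 1 and reads only positions whose bit k is 0, so the sequential loop gives the same result as a simultaneous parallel step.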

Searching in $\mathcal{D}$ for all intersecting pairs of horizontal and vertical line segments is the same as searching in $\mathcal{E}$ except we have to choose one, possibly two, out of $N^\alpha$ branches at one level of $\mathcal{D}$ for each horizontal line. The procedure INTERSECT3 to be presented in the Appendix can be implemented on a CCC with $4n^{1+\alpha} + 2mN^\alpha$ processors in $O(\frac{1}{\alpha}\log(n+m) + k)$ parallel steps, where k is the maximum number of intersections per vertical line segment. We state this result in the following theorem.

Theorem 3.8. All intersecting pairs of n vertical line segments and m horizontal line segments can be reported in time $O(\frac{1}{\alpha}\log(n+m) + k)$ on a CCC with $4(n+m)n^\alpha$ processors, $0 < \alpha \le 1$, where k is the maximum number of intersections per vertical line segment.

3.3.2  Two-Dimensional  Range  Searching 

For the two-dimensional range searching problem, we arrange the set S of points into the data structure $\mathcal{G}$ (similar to $\mathcal{D}$), so that the set Q of queries can be answered efficiently. In $\mathcal{G}$, $G_{1/\alpha}, \ldots, G_0$ are arrays of points in S. The points in array $G_i$ are ordered by their node numbers at level i and x-values. The node number, at level i, of a point is j if its y-value is in the range $[B_i(j), T_i(j)]$. Node j, which is the k-th (for some k) leftmost node in level i, has $N^\alpha$ sons at level i-1; they are nodes $j,\ N^{1-i\alpha}+j,\ \ldots,\ (N^\alpha-1)N^{1-i\alpha}+j$, representing respectively the intervals

$[B_{i-1}(j),\ T_{i-1}(j)] = [Y(kN^\alpha N^{i\alpha-\alpha}),\ Y((kN^\alpha+1)N^{i\alpha-\alpha}-1)]$,

$[B_{i-1}(N^{1-i\alpha}+j),\ T_{i-1}(N^{1-i\alpha}+j)] = [Y((kN^\alpha+1)N^{i\alpha-\alpha}),\ Y((kN^\alpha+2)N^{i\alpha-\alpha}-1)]$, $\ldots$,

$[B_{i-1}((N^\alpha-1)N^{1-i\alpha}+j),\ T_{i-1}((N^\alpha-1)N^{1-i\alpha}+j)] = [Y((kN^\alpha+N^\alpha-1)N^{i\alpha-\alpha}),\ Y((kN^\alpha+N^\alpha)N^{i\alpha-\alpha}-1)]$.

Figure 12 is an example of a set of 20 points and the corresponding data structure with N = 16 and $\alpha = \frac{1}{2}$. Figure 12(b) is the underlying $N^\alpha$-ary tree; the pairs of integers in the circles are values of $B_i(j)$ and $T_i(j)$, and the integers above the circles are node numbers.

The construction of $\mathcal{G}$ is similar to that of $\mathcal{D}$. Since the cardinality of $G_i$ is n for all i, $\mathcal{G}$ can be constructed in time $O(\frac{1}{\alpha}\log n)$ on a CCC with $nN^\alpha$ processors. The program CONSTRUCT_G for constructing $\mathcal{G}$ will be presented in the Appendix.

Given a set Q of m queries, we search in $\mathcal{G}$ until we reach a node j such that $Q(k)[B] \le B_i(j)$ and $T_i(j) \le Q(k)[T]$. Then we perform a one-dimensional range searching on $G_i$. We may have to search at most $2N^\alpha$ nodes at one particular level for a particular query. Therefore, we may need at most $2N^\alpha m + nN^\alpha$ processors. The analysis of time complexity of this range searching is straightforward.

Theorem 3.9. The two-dimensional range searching problem for n data and m queries can be solved in time $O(\frac{1}{\alpha}\log(n+m) + k)$ on a CCC with $2(n+4m)n^\alpha$ processors, where $0 < \alpha \le 1$ and k is the maximum number of inclusions per query.

The program RANGE_SEARCH3 will be presented in the Appendix.

3.3.3  The  Rectangle  Intersection  Algorithm 

The rectangle intersection algorithm for a CCC with a superlinear number of processors uses results in Sections 3.3.1 and 3.3.2. The running time is $O(\frac{1}{\alpha}\log N + k)$ and the number of processors is $10N^{1+\alpha}$.

procedure RECTINT3(REC):
begin
   V ← all vertical edges of rectangles in REC
   H ← all horizontal edges of rectangles in REC
   call INTERSECT3(V,H)
   S ← all leftmost bottom points of REC
   Q ← REC
   call RANGE_SEARCH3(S,Q)
end

We can use some processor-time tradeoffs similar to the one used in Section 3.1.3 to obtain the following results.

Theorem 3.10. Given N rectangles with edges parallel to the coordinate axes, all intersecting pairs of these rectangles can be reported in time $O(\frac{1}{\alpha}\log N + k)$ on a CCC with $N^{1+\alpha}$ processors, $0 < \alpha \le 1$, where k is the maximum number of intersections per rectangle.

CHAPTER 4

PLANAR POINT LOCATION

The problem of planar point location is stated as follows: given a planar graph embedded in the plane as a straight line graph [21] G with N vertices and a point P, find the region of the planar subdivision induced by G which contains P. This problem is quite important in computational geometry. We shall show in later sections how it can be applied to solve other problems. A recent and practical result for serial computation on this problem is due to Preparata [28]. His algorithm runs in O(logN) time on a data structure which can be constructed in O(NlogN) time.

Many times, point locations are performed repeatedly on the same graph; therefore, it is beneficial to arrange the given graph into an organized structure to facilitate searching. Furthermore, very often, these searches are independent and can be performed simultaneously.

In this chapter we preprocess the given graph G = (V,E) so that we can locate M points simultaneously on the SMM and on the CCC. V(0: N-1) is the set of vertices and E(0: |E|-1) is an array of records containing information about each edge: its two endpoints and the regions lying on either side of it (left and right). We shall assume that Y(0: N-1) is the sorted array of distinct y-values of V and N is a power of 2.

Figure 13 shows a planar straight-line graph with 20 vertices and 16 distinct y-values, i.e., N = 16.

4.1 On the SMM with max(N,M) Processors

In this section we describe two algorithms: (i) the construction of a search structure for the set of edges on the SMM with N processors and (ii) the concurrent location of M points with M processors. The construction and the location run in time $O((\log N)^2 \log\log N)$ and $O((\log N)^2)$ respectively.

4.1.1  Definition  and  Construction  of  the  Point  Location  Tree 
Recall the search tree $\mathcal{T}$ introduced in Section 3.1.1. We can produce $\mathcal{T}$ for the set of edges of the given graph G, which will be referred to as the point location tree for G. Figure 14 gives the point location tree for the graph in Figure 13. Recall that the initial step of the procedure CONSTRUCT_T1 developed in Section 3.1.1 is to obtain an ordering of the set E(0: |E|-1) of edges such that if E(i) is to the left of E(j) then E(i) precedes E(j) in the ordering. Unfortunately, there is no known efficient parallel algorithm for topological sorting. Therefore, we cannot use the same procedure CONSTRUCT_T1 to produce the point location tree $\mathcal{T}$ for the edges. Since the list associated with node $\mathrm{NODE}_i(j)$ consists of edges which span the same y-interval $[B_i(j), T_i(j)]$, these edges are comparable, that is, every edge is either to the left or to the right of another edge in the same list. We can sort the edges in the lists associated with each node after the members of the lists have been determined. Since each node contains at most |E| edges (|E| < 3N) and each edge is contained in at most two nodes at any one level, we can sort the edges in every node at level i in time O(logN loglogN) using N processors. Again we construct $\mathcal{T}$ level by level beginning from the root. The procedure CONSTRUCT_T2, which will be presented in the appendix, for the set of edges is the same as CONSTRUCT_T1 for the set of vertical line segments except we do not initially order the edges in the entire set.

4.1.2  Point  Location 

To locate a point P(k) in the planar subdivision induced by G, we use $\mathcal{T}$ as a binary search tree. We define two "dummy" vertical edges $E_{-\infty}$ and $E_{+\infty}$ of infinite length which are at negative and positive infinity respectively. Associated with P(k), we determine a pair of edges L(k) and R(k) of E which bound P(k) on the left and on the right respectively. Initially, we set L(k) and R(k) to $E_{-\infty}$ and $E_{+\infty}$, respectively. We search $\mathcal{T}$ until L(k) and R(k) bound the same region: at a selected node $\mathrm{NODE}_i(j)$ of $\mathcal{T}$, where the edges form an ordered set, we perform a binary search for an edge immediately to the left (right) of P(k) and compare this edge with L(k) (R(k)); the one closer to P(k) is the new value of L(k) (R(k)). If L(k) and R(k) bound the same region, P(k) is in this region; otherwise, we have to choose a branch or both by comparing the y-value of P(k) with $T_{i-1}(2j)$: if it is less than, greater than or equal to $T_{i-1}(2j)$ then we branch respectively to the left, the right or both branches (refer to Figure 15). Note that the y-value of P(k) may be equal to only one $T_{i-1}(2j)$. Thus, we trace a unique path, possibly two (when the y-value of P(k) is equal to some $T_{i-1}(2j)$), from the root to (at most) the bottom level of $\mathcal{T}$. Since $\mathcal{T}$ is of height logN + 1 and the edges in each node are sorted, this process runs in time $O((\log N)^2)$. We can locate all M points simultaneously provided we search in one level of $\mathcal{T}$ for all points before going to the next level. The number of processors required is M for parallel searching.

We shall present the formal description LOCATE1 in the appendix.
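
The binary search performed at a single node can be sketched as follows. This is an illustration, not LOCATE1: it assumes each edge is a pair `((x1, y1), (x2, y2))` with `y1 < y2`, that the list is sorted from left to right, and that every edge spans the y-value of the point. `None` stands in for the dummy edges at negative and positive infinity; the names `side` and `bracket` are hypothetical.

```python
def side(edge, p):
    """Cross product of (upper - lower) and (p - lower); negative when p is strictly right of the edge."""
    (x1, y1), (x2, y2) = edge
    return (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)

def bracket(edges, p):
    """Return (left, right): the edges immediately left and right of p."""
    lo, hi = 0, len(edges)
    while lo < hi:                     # find the number of edges strictly left of p
        mid = (lo + hi) // 2
        if side(edges[mid], p) < 0:
            lo = mid + 1
        else:
            hi = mid
    left = edges[lo - 1] if lo > 0 else None
    right = edges[lo] if lo < len(edges) else None
    return left, right
```

Because the edges of one node are pairwise comparable, the predicate "p is strictly right of the edge" is monotone along the list, which is what makes the binary search valid. No angle or intersection coordinate is computed; the side test uses only multiplications.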

We  conclude  this  section  by  the  following  theorem. 

Theorem 4.1. Given a planar straight line graph with N vertices, we can locate M points in the planar subdivision induced by the graph in time $O((\log N)^2)$ with $O((\log N)^2 \log\log N)$ preprocessing time on a SMM with max(N,M) processors and memory units.

4.2  On  the  CCC  with  N  +  M  Processors 

In this section we revisit the problem of planar point location as discussed in Section 4.1. We shall revise procedure LOCATE1 so that it will be suitable for implementation on a CCC with a linear number of processors.

4.2.1  Construction  of  the  Search  Structure 

In Section 3.2.2, we constructed a search structure $\mathcal{E}$ (a set of arrays $E_0, E_1, \ldots, E_{\log N}$) for a set of vertical line segments. We can produce the same structure $\mathcal{E}$ for the set of edges. Figure 16(a) is the underlying binary tree of $\mathcal{E}$ for the graph in Figure 13. Note that this tree is the same as the point location tree in Figure 14 except for the node indices. Figure 16(b) shows the collection of arrays $E_{\log N}, \ldots, E_0$ and the corresponding node numbers of edges.

As discussed in Section 4.1.1, it is relatively time-consuming to obtain initially a total ordering of the edges. Thus, we first determine the edges in $E_i$, then sort them lexicographically by their node numbers and their positions in the positive x direction. We can develop a procedure CONSTRUCT_E2 for producing $\mathcal{E}$ for the set of edges which will be the same as procedure CONSTRUCT_E1 in Section 3.2.2 for a set of vertical line segments, except that in CONSTRUCT_E2 we do not initially order the entire set of edges, but order the edges in each $E_i$ separately. Since the cardinality of each $E_i$ is at most 2|E| (|E| < 3N), we can easily verify that the procedure CONSTRUCT_E2 in the appendix runs in time $O((\log N)^3)$ on a CCC with N processors.

4.2.2 Point Location

As a preliminary step, we sort the set P(0: M-1) of points to be located by their x-coordinates. Like point location on a SMM in Section 4.1.2, for each point P(k), we search in $\mathcal{E}$ until the two edges L(k) and R(k) bound the same region. We associate with each point P(k) a node number NN(k) indicating that the y-coordinate of P(k) is in the range $[B_i(NN(k)), T_i(NN(k))]$ at some level i. We start at $E_{\log N}$ (the root). It is obvious that NN(k) is equal to 0 for all k at the root. The set of points is maintained sorted lexicographically by their node numbers NN(k) and their x-coordinates. Since $E_i$ is sorted in the same manner, we can use the parallel searching algorithm in Section 2.2.3 to determine the pairs of edges L(k) and R(k). If L(k) and R(k) do not bound the same region, we have to determine in which node in the next level of $\mathcal{E}$ we should continue to search. This process pictorially traces, in the underlying binary search tree of $\mathcal{E}$, a unique path, possibly two, for each point, from the root to the bottom level. Since the parallel searching at each level requires O(log(N + M)) time and $\mathcal{E}$ has logN + 1 levels, the point location described above runs in time O(log(N + M) logN) on a CCC with N + M processors. We present the formal point location procedure LOCATE2 in the appendix.

Procedure LOCATE2 gives us the following theorem.

Theorem 4.2. Given a planar straight-line graph with N vertices, we can locate M points in the planar subdivision induced by the graph in time $O((\log(N+M))^2)$ with $O((\log N)^3)$ preprocessing time on a CCC with N + M processors.

4.3 On the CCC with $(N+M)^{1+\alpha}$ Processors

In this section we investigate the problem of point location on a CCC with $(N+M)^{1+\alpha}$ processors, where N is the number of vertices of a given graph, M is the number of points to be located, and $0 < \alpha \le 1$.

4.3.1  Definition  and  Construction  of  the  Search  Structure 
Recall the search structure $\mathcal{D}$ we constructed for a set of vertical line segments in the algorithm for reporting intersection of vertical and horizontal line segments (Section 3.3.1). The underlying geometric structure of $\mathcal{D}$ is an $N^\alpha$-ary tree of height $\frac{1}{\alpha}$ (refer to Figure 18). Figure 17 shows the same planar straight line graph as in Figure 13 but with different edge segmentation. We can produce the same structure $\mathcal{D}$ for the set of edges. $\mathcal{D}$ will consist of $\frac{1}{\alpha}+1$ arrays $D_{1/\alpha}, \ldots, D_0$, each of which is a selected subset of edges sorted lexicographically by their node numbers and their positions in the positive x direction. Here again, for well known reasons, we first determine the edges in $D_i$ and then sort them. By the same argument as in Section 3.3.1, each $D_i$ contains at most $2N^{1+\alpha}$ edges. Therefore $\mathcal{D}$ can be constructed in time $O(\frac{1}{\alpha}(\log N)^2)$ on a CCC with $N^{1+\alpha}$ processors. The procedure CONSTRUCT_D2, which will be presented in the appendix for the set of edges, is similar to the procedure CONSTRUCT_D1 for a set of vertical line segments with the following difference. In procedure CONSTRUCT_D2, we do not initially order the entire set of edges but we determine the members of each $D_i$ before we order them.

4.3.2  Point  Location 

Point location in $\mathcal{D}$ is the same as point location in $\mathcal{E}$ except we have to choose one, possibly two, out of $N^\alpha$ branches at any level of $\mathcal{D}$ for each point. The procedure LOCATE3, to be presented in the appendix, can be implemented on a CCC with $(N+M)^{1+\alpha}$ processors in $O(\frac{1}{\alpha}(\log(N+M))^2)$ parallel steps. We state this in the following theorem.

Theorem 4.3. Given a planar straight line graph G with N vertices, we can locate M points in the planar subdivision induced by G in time $O(\frac{1}{\alpha}\log(N+M))$ with $O(\frac{1}{\alpha}(\log(N+M))^2)$ preprocessing time on a CCC with $(N+M)^{1+\alpha}$ processors.

CHAPTER 5

CONVEX HULLS OF SETS OF POINTS IN TWO DIMENSIONS

Formally, the convex hull of a finite set S of points is the intersection of all convex sets containing S. In the plane, the convex hull of S, CH(S), is a convex polygon. Specifying a polygon unambiguously requires giving its vertices in the order that they occur on the boundary.

A simple polygon is in standard form if its vertices occur in clockwise order with all vertices distinct and no three consecutive vertices collinear, beginning with the vertex that has largest y-coordinate.
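
The definition of standard form can be turned into a small normalization routine. The sketch below is illustrative and assumes a non-degenerate convex polygon given as a list of `(x, y)` tuples in boundary order (either orientation); when several vertices share the largest y-coordinate it simply starts at the first one found, a tie-breaking choice the definition does not fix. The names `cross` and `standard_form` are hypothetical.

```python
def cross(o, a, b):
    """Twice the signed area of triangle o, a, b (positive if counterclockwise)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def standard_form(poly):
    # 1. drop vertices that repeat their predecessor (indices are modulo n)
    pts = [p for i, p in enumerate(poly) if p != poly[i - 1]]
    # 2. drop the middle vertex of every collinear consecutive triple
    n = len(pts)
    pts = [pts[i] for i in range(n)
           if cross(pts[i - 1], pts[i], pts[(i + 1) % n]) != 0]
    # 3. make the order clockwise (signed area must not be positive)
    area2 = sum(p[0] * q[1] - q[0] * p[1]
                for p, q in zip(pts, pts[1:] + pts[:1]))
    if area2 > 0:
        pts.reverse()
    # 4. rotate so that a vertex with largest y-coordinate comes first
    start = max(range(len(pts)), key=lambda i: pts[i][1])
    return pts[start:] + pts[:start]
```

For example, the counterclockwise hexagon `[(0,0), (2,0), (4,0), (4,2), (2,3), (0,2)]` loses the collinear vertex `(2,0)`, is reversed, and is rotated to begin at `(2,3)`.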

The problem of convex hulls arises in many applications: finding the diameter of a set, determining the existence of a linear classifier of a set, etc. Several optimal algorithms for determining sequentially the convex hull of a set of N points in two dimensions have been developed [2,9,30,35]. These algorithms use the well-known technique called "divide and conquer" [1] and achieve the running time of O(NlogN). In a parallel machine, the subproblems generated by the "divide and conquer" method can be solved simultaneously, so an efficient algorithm for combining the results of these subproblems is essential for an overall fast parallel algorithm. We shall develop some preliminaries before designing convex hull algorithms on the SMM and on the CCC.

5.1  Preliminaries 

Given a convex polygon A(0: n-1) in standard form, let $\ell_A$, $s_A$ and $r_A$ be the indices (taken modulo n) of the vertices with least x coordinate, least y coordinate and largest x coordinate respectively. Given two points p and q in the plane, $\theta(p,q)$ denotes the polar angle of q with p as the origin. We define $\alpha_{i,j} = \theta(A(i),A(j))$.(1) Due to convexity, in the range $0 \le i < n-1$, the sequence $(\ldots, \alpha_{i,i+1}, \ldots)$ is decreasing.

Let A(0: n-1) and B(0: m-1) be two convex polygons where the y-coordinate of A(i) is less than that of B(j), for $0 \le i < n$ and $0 \le j < m$, so A and B are non-intersecting. We define $\gamma_{i,j} = \theta(A(i),B(j))$.(2) A sequence is V-bitonic if it consists of a decreasing sequence, which may be empty, followed by an increasing sequence. A sequence is Λ-bitonic if it consists of an increasing sequence, which may be empty, followed by a decreasing sequence. Due to convexity, in the range $0 \le i < s_A$ the sequence $(\gamma_{i,0}, \gamma_{i,1}, \ldots, \gamma_{i,s_B})$ is V-bitonic and in the range $s_A \le i < n$ the sequence $(\gamma_{i,s_B}, \gamma_{i,s_B+1}, \ldots, \gamma_{i,m})$ is Λ-bitonic (refer to Figure 19). We define $j^{(i)}$ as $\min\{j \mid \gamma_{ij} \le \gamma_{ik},\ 0 \le k \le r_B\}$ for i, $0 \le i \le r_A$, and as $\min\{j \mid \gamma_{ij} \le \gamma_{ik},\ r_B \le k \le s_B\}$ for i, $r_A \le i \le s_A$. We also define $\bar{j}^{(i)}$ as $\max\{j \mid \gamma_{ij} \ge \gamma_{ik},\ s_B \le k \le \ell_B\}$ for i, $s_A \le i \le \ell_A$, and as $\max\{j \mid \gamma_{ij} \ge \gamma_{ik},\ \ell_B \le k \le m\}$ for i, $\ell_A \le i \le n$. We shall explore some characteristics of $j^{(i)}$ and $\bar{j}^{(i)}$.

Lemma 5.1. $\alpha_{i+1,i} < \gamma_{i,j^{(i)}} \Rightarrow j^{(i)} \le j^{(i+1)}$, $0 \le i < s_A$.

Proof: The condition $\alpha_{i+1,i} < \gamma_{i,j^{(i)}}$ implies A(i+1) is in the hatched region (refer to Figure 20). Suppose $j^{(i+1)} < j^{(i)}$; this implies that $B(j^{(i+1)})$ is in the crosshatched region. Then it yields the contradiction $\gamma_{i+1,j^{(i+1)}} > \gamma_{i+1,j^{(i)}}$ on the definition of $j^{(i+1)}$. □

(1) $\alpha_{i,j}$ is defined as a polar angle for explanatory purposes only; in the implementation of the operation of comparing two angles, we shall avoid computation of angles by replacing it with the operation of comparing the negative values of their cotangents, where the function cotangent: $(0,\pi] \to [-\infty,\infty]$ is an order-reversing mapping.

(2) Same as (1).
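
The remark in footnote (1) can be illustrated with a short sketch. It assumes integer coordinates, so that the negative cotangent $-\Delta x / \Delta y$ is an exact rational, and it represents the angle $\pi$ (where the cotangent is unbounded) by floating-point infinity. The function name `neg_cot` is hypothetical.

```python
from fractions import Fraction
import math

def neg_cot(p, q):
    """Negative cotangent of the polar angle of q about p, for angles in (0, pi].

    Larger angles give larger keys, so keys can be compared in place of angles.
    """
    dx, dy = q[0] - p[0], q[1] - p[1]
    if dy < 0 or (dy == 0 and dx >= 0):
        raise ValueError("polar angle must lie in (0, pi]")
    if dy == 0:
        return math.inf                # angle pi: -cot tends to +infinity
    return Fraction(-dx, dy)           # exact for integer coordinates
```

Sorting points by `neg_cot` yields the same order as sorting by `math.atan2`, without evaluating any trigonometric function.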


Figure  20 


Lemma 5.2. $j^{(i)} < j^{(i+1)} \Rightarrow \alpha_{i+1,i} < \gamma_{i,j^{(i)}}$, $0 \le i < s_A$.

Proof: $j^{(i)} < j^{(i+1)}$ means $B(j^{(i+1)})$ is in the hatched region in Figure 21. Suppose $\alpha_{i+1,i} \ge \gamma_{i,j^{(i)}}$, which implies A(i+1) is in the crosshatched region. We then have $\gamma_{i+1,j^{(i+1)}} > \gamma_{i+1,j^{(i)}}$, which contradicts the definition of $j^{(i+1)}$. □

By similar arguments we have the following lemmas on $\bar{j}^{(i)}$.

Lemma 5.3. $\alpha_{i,i+1} < \gamma_{i,\bar{j}^{(i)}} \Rightarrow \bar{j}^{(i)} \ge \bar{j}^{(i+1)}$, $s_A \le i < n$.

Lemma 5.4. $\bar{j}^{(i)} > \bar{j}^{(i+1)} \Rightarrow \alpha_{i,i+1} < \gamma_{i,\bar{j}^{(i)}}$, $s_A \le i \le n$.

We are going to use these lemmas to show an important property of the sequence $(j^{(i)})$.

Theorem 5.1. In the range $0 \le i < r_A$, if $j^{(i-1)} < j^{(i)}$ for some i then $j^{(i)} \le j^{(i+1)} \le \ldots \le j^{(r_A)}$. And in the range $r_A \le i < s_A$, if $j^{(i-1)} < j^{(i)}$ then $j^{(i)} \le j^{(i+1)} \le \ldots \le j^{(s_A)}$.

Proof: We shall show that if $j^{(i-1)} < j^{(i)}$ then $\alpha_{k+1,k} < \gamma_{k,j^{(k)}}$ for k = i-1, i, ..., h, where h is $r_A$ for $0 \le i < r_A$ and $s_A$ for $r_A \le i < s_A$. We prove by induction on k. The basis $\alpha_{i,i-1} < \gamma_{i-1,j^{(i-1)}}$ is true by Lemma 5.2. In the inductive step, we assume that $\alpha_{k,k-1} < \gamma_{k-1,j^{(k-1)}}$. Then by Lemma 5.1, $j^{(k-1)} \le j^{(k)}$. Referring to Figure 22, we have $\alpha_{k,k-1} < \gamma_{k,j^{(k)}}$. Due to convexity, $\alpha_{k+1,k} < \alpha_{k,k-1}$. Therefore, we have $\alpha_{k+1,k} < \gamma_{k,j^{(k)}}$; hence, the statement in Lemma 5.1 completes the proof. □

Using an argument similar to the one above, we can establish the following theorem.

Theorem 5.2. In the range $s_A \le i \le \ell_A$, if $\bar{j}^{(i-1)} > \bar{j}^{(i)}$ for some i then $\bar{j}^{(i)} \ge \bar{j}^{(i+1)} \ge \ldots \ge \bar{j}^{(\ell_A)}$. And in the range $\ell_A \le i \le n$, if $\bar{j}^{(i-1)} > \bar{j}^{(i)}$ then $\bar{j}^{(i)} \ge \bar{j}^{(i+1)} \ge \ldots \ge \bar{j}^{(n)}$.

These two theorems can be interpreted as follows:

Corollary 5.1. $(j^{(0)}, \ldots, j^{(r_A)})$ is a nonincreasing sequence followed by a nondecreasing sequence; so is $(j^{(r_A)}, \ldots, j^{(s_A)})$. $(\bar{j}^{(s_A)}, \ldots, \bar{j}^{(\ell_A)})$ is a nondecreasing sequence followed by a nonincreasing sequence; so is $(\bar{j}^{(\ell_A)}, \ldots, \bar{j}^{(n)})$.

5.2  Merging  Two  Convex  Hulls 

Given two convex polygons A(0: n-1) and B(0: m-1), where the y-value
of A(i) is smaller than that of B(j) for 0 ≤ i < n and 0 ≤ j < m, by merging
of A and B we mean the determination of the convex polygon
C(0: j* - i* + ī* - j̄* + m - 1) which is obtained by tracing the two lines of
support (A(i*),B(j*)) and (A(ī*),B(j̄*)) common to A and B, to be referred
to as left and right tangents respectively, and by eliminating the vertices
of A and B which become internal to the resulting polygon (refer to
Figure 23).

It is observed that if $B(r_B)$ is to the left of $A(r_A)$, then i* and j*
are in the ranges $[0, r_A]$ and $[0, r_B]$, respectively; otherwise, i* and j*
are in the ranges $[r_A, s_A]$ and $[r_B, s_B]$ respectively. It is also observed
that if $B(\ell_B)$ is to the left of $A(\ell_A)$, then ī* and j̄* are in the intervals
$[s_A, \ell_A]$ and $[s_B, \ell_B]$ respectively; otherwise, ī* and j̄* are in the intervals
$[\ell_A, n]$ and $[\ell_B, m]$ respectively. Furthermore, the tangents (A(i*),B(j*)) and
(A(ī*),B(j̄*)) are characterized by the following properties:

(1) $j^* = j^{(i^*)}$ and $\bar{j}^* = j^{(\bar{i}^*)}$;

(2) $\alpha_{i^*,i^*-1} > \gamma_{i^*,j^*}$ and $\alpha_{i^*,i^*+1} - \gamma_{i^*,j^*} < \pi$;

    $\alpha_{\bar{i}^*,\bar{i}^*+1} < \gamma_{\bar{i}^*,\bar{j}^*}$ and $\alpha_{\bar{i}^*,\bar{i}^*-1} - \gamma_{\bar{i}^*,\bar{j}^*} > \pi$.

Figure  24  clarifies  these  properties. 

The index j* has another property which is not so obvious as those
above, as expressed by the following lemma:

Lemma 5.5. $j^* \le j^{(i)}$ for $0 \le i \le r_A$ when $0 \le j^* \le r_B$ and for $r_A \le i \le s_A$
when $r_B \le j^* \le s_B$.

Proof: Suppose $j^* > j^{(k)}$ for some k in the appropriate range. Due to
property (2) of i* and j*, A(k) must be in the hatched region, and due
to property (1) and the assumption $j^* > j^{(k)}$, $B(j^{(k)})$ must be in the
crosshatched region. We observe from Figure 25 that $\gamma_{k,j^{(k)}} > \gamma_{k,j^*}$, which
contradicts the definition of $j^{(k)}$. Therefore, $j^* \le j^{(i)}$ for all i in the
specified range. □

By a similar proof, we can show that the index j̄* is largest among
the $j^{(i)}$'s.


Lemma 5.6. $\bar{j}^* \ge j^{(i)}$ for $s_A \le i \le \ell_A$ when $s_B \le \bar{j}^* \le \ell_B$ and for $\ell_A < i \le n$
when $\ell_B \le \bar{j}^* \le m$.

A  merging  algorithm  for  two  convex  polygons  may  consist  of  the 
following  three  major  steps: 

1. find j* and j̄*;

2. determine i* and ī* which, with j* and j̄*, satisfy properties (1)
and (2);

3. rearrange the vertices of the resulting polygon.


Figure 24. Illustration of properties of tangents.
(a) $\alpha_{i^*,i^*-1} > \gamma_{i^*,j^*}$ and $\alpha_{i^*,i^*+1} - \gamma_{i^*,j^*} < \pi$:
    (i) $B(r_B)$ is to the left of $A(r_A)$; (ii) $B(r_B)$ is to the right of $A(r_A)$.
(b) $\alpha_{\bar{i}^*,\bar{i}^*+1} < \gamma_{\bar{i}^*,\bar{j}^*}$ and $\alpha_{\bar{i}^*,\bar{i}^*-1} - \gamma_{\bar{i}^*,\bar{j}^*} > \pi$:
    (i) $B(\ell_B)$ is to the left of $A(\ell_A)$; (ii) $B(\ell_B)$ is to the right of $A(\ell_A)$.


We shall describe the merging algorithm in more detail in the
following sections.

5.3 On the SMM with N Processors

In this section we shall present a "divide and conquer" algorithm for
finding the convex hull of a set of N points in the plane on a SMM with
N processors. We shall study methods for finding the minimum (maximum)
of a V-bitonic (Λ-bitonic) sequence and for merging two convex polygons
on the SMM.

5.3.1 Finding the Minimum (Maximum) of a V-bitonic (Λ-bitonic) Sequence

Given a V-bitonic (Λ-bitonic) sequence D(0: n-1), we want to find
the smallest (largest) index k such that D(k) is a minimum (maximum)
of the sequence. The index k has the property that D(k-1) > D(k) ≤ D(k+1)
(D(k-1) ≤ D(k) > D(k+1)). Therefore, it is obvious that k can be found in
constant time on a SMM with n processors and n memory units.

We are going to solve this problem on a SMM with $\sqrt{n}$ processors and n
memory units. We first find the smallest (largest) index i such that $D(i\sqrt{n})$
is a minimum (maximum) of the sequence $(D(\sqrt{n}), D(2\sqrt{n}), \dots, D((\sqrt{n}-1)\sqrt{n}))$.
Note that this sequence is also V-bitonic (Λ-bitonic). It is observed that
k must be in the interval $[(i-1)\sqrt{n}+1, (i+1)\sqrt{n}-1]$ which is of length $2\sqrt{n}-1$;
$(D((i-1)\sqrt{n}+1), \dots, D(i\sqrt{n}))$ and $(D(i\sqrt{n}), \dots, D((i+1)\sqrt{n}-1))$ are both V-bitonic
sequences of length $\sqrt{n}$. Therefore, the index k can be determined
in constant time with $\sqrt{n}$ processors. The function MIN_V_BITONIC is a
formal description of the above method to determine the index k.

function MIN_V_BITONIC (D(0: n-1))

/* this function returns the index k such that D(k-1) > D(k) ≤ D(k+1),
   when D is a V-bitonic sequence */
begin

   foreach j, j ∈ {1, 2, ..., √n-1} do

      if D((j-1)√n) > D(j√n) and
         D(j√n) ≤ D((j+1)√n)
      then i ← j

   foreach j, j ∈ {(i-1)√n+1, (i-1)√n+2, ..., i√n} do

      if D(j-1) > D(j) and D(j) ≤ D(j+1) then k ← j

   foreach j, j ∈ {i√n+1, ..., (i+1)√n-1} do

      if D(j-1) > D(j) and D(j) ≤ D(j+1) then k ← j

   return k

end


We can obtain the function MAX_A_BITONIC for a Λ-bitonic sequence by
interchanging > and ≤ in MIN_V_BITONIC.
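The two-phase search can be sketched sequentially in Python. This is only an illustration of the sampling idea, not the SMM procedure itself: the name `min_v_bitonic` is made up here, each `min` call stands in for one constant-time parallel step, and the input is assumed to be strictly decreasing down to its minimum and non-decreasing afterwards.

```python
import math

def min_v_bitonic(D):
    """Smallest index of a minimum of a V-bitonic list D.

    Assumption: D is strictly decreasing up to its minimum and
    non-decreasing afterwards (so D[k-1] > D[k] <= D[k+1] at the answer).
    """
    n = len(D)
    step = max(1, math.isqrt(n))          # block length, about sqrt(n)
    # Phase 1: inspect every step-th element; the samples are V-bitonic too.
    samples = range(0, n, step)
    best = min(samples, key=lambda idx: (D[idx], idx))
    # Phase 2: the true minimum lies within one block of the best sample.
    lo = max(0, best - step)
    hi = min(n - 1, best + step)
    return min(range(lo, hi + 1), key=lambda idx: (D[idx], idx))
```

Both phases inspect on the order of sqrt(n) elements, which is what allows sqrt(n) processors to finish each phase in a constant number of parallel steps.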

5.3.2  Finding  the  Common  Tangents  of  Two  Convex  Polygons 

We now develop an algorithm for an SMM for finding the left tangent
(A(i*),B(j*)) and the right tangent (A(ī*),B(j̄*)), as defined in Section 5.2.
Let us consider the determination of j*. Assume that $B(r_B)$ is
to the left of $A(r_A)$ (the other case can be treated in the same way).
Since $j^{(i)}$, where $0 \le i \le r_A$, is the smallest index of the minimum of the
V-bitonic sequence $(\gamma_{i,0}, \dots, \gamma_{i,r_B})$, $j^{(i)}$ can be found in constant time with
$\sqrt{r_B+1}$ processors. We determine $j^{(i)}$ for $i = \sqrt{r_A+1}, 2\sqrt{r_A+1}, \dots, (\sqrt{r_A+1}-1)\sqrt{r_A+1}$
(refer to Figure 26). This can be achieved in constant time with
$(\sqrt{r_A+1}-1)\sqrt{r_B+1}$ processors. Then we find the smallest index i such that
$j^{(i)}$ is a minimum among $\{j^{(\sqrt{r_A+1})}, \dots, j^{((\sqrt{r_A+1}-1)\sqrt{r_A+1})}\}$. This can be done
in time $O((\log(\sqrt{r_A+1}-1))^2)$ with $\sqrt{r_A+1}-1$ processors (refer to Section 2.1.2).
The index j* is the smallest in the set $\{j^{(i-\sqrt{r_A+1}+1)}, j^{(i-\sqrt{r_A+1}+2)}, \dots,
j^{(i+\sqrt{r_A+1}-1)}\}$ of size $2\sqrt{r_A+1}-1$. Therefore j* can be found in $O((\log n)^2)$
time on an SMM with $\sqrt{nm}$ processors. The index j̄* can be determined in a
similar way. The indices i* and ī* are the two i's which satisfy properties
(1) and (2) as described in Section 5.2. Knowing j* and j̄*, the indices
i* and ī* can be determined in constant time with n processors. We shall
present formally, in the appendix, the procedure TANGENTS1 which determines
and returns the indices j*, i*, j̄* and ī*.

In conclusion, the left and right tangents can be determined in time
$O((\log n)^2)$ with at most m+n processors. Next, we shall consider the entire
convex hulls algorithm.

5.3.3  Convex  Hulls  Algorithm 

As a preliminary step, we sort the set S of points by their
y coordinates in descending order. This can be done in $O((\log N)^2)$
time with N processors. The convex hulls algorithm to be presented is
a recursive program. The major step is the merging procedure which
determines the left and right tangents of two convex hulls and
rearranges the vertices of the resulting hull.

function CH21 (S)

/* returns CH(S); S is a set of N points in the plane */
begin if N ≤ 2 then return (S)

   return (MERGE1(CH21(S(N/2: N-1)), CH21(S(0: N/2-1))))

end

function MERGE1(A,B):

/* returns the convex hull of polygons A and B */
begin

   (j*, i*, j̄*, ī*) ← TANGENTS1(A,B)

   foreach k, 0 ≤ k ≤ j* do C(k) ← B(k)

   foreach k, i* ≤ k ≤ ī* do C(j* - i* + 1 + k) ← A(k)

   foreach k, j̄* ≤ k < m do C(j* - i* + 2 + ī* - j̄* + k) ← B(k)

   return (C(0: j* - i* + ī* - j̄* + m - 1))

end
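The rearrangement step of the merge can be illustrated with a short Python sketch. It assumes the four tangent indices are already known (here they are simply supplied by hand, standing in for the result of TANGENTS1) and that all of them lie inside the index ranges of A and B. The function name `merge_hulls` and the sample polygons are hypothetical.

```python
def merge_hulls(A, B, j_star, i_star, j_bar, i_bar):
    """Concatenate the three surviving vertex runs of the merged polygon.

    A is the lower polygon, B the upper one, both listed clockwise from
    their topmost vertex.  The explicit index offsets of the pseudocode
    are replaced here by list concatenation.
    """
    return B[: j_star + 1] + A[i_star : i_bar + 1] + B[j_bar:]

# Two diamonds stacked vertically; the tangent indices are given by hand.
B = [(0, 10), (2, 8), (0, 6), (-2, 8)]
A = [(0, 4), (2, 2), (0, 0), (-2, 2)]
C = merge_hulls(A, B, j_star=1, i_star=1, j_bar=3, i_bar=3)
```

In the parallel setting each of the three copies is one independent step, since every processor can compute the destination of its vertex from the four indices alone.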

The running time T(N) of function CH21 can be obtained by the
recurrence relation T(N) ≤ T(N/2) + M(N), where M(N) is the running time
of function MERGE1. We have shown that the tangents can be found in
$O((\log(N/2))^2)$ with N processors, and it is obvious that the rearrangement
can be done in constant time. Therefore, $M(N) = O((\log N)^2)$. Hence
$T(N) = O((\log N)^2)$.

Theorem 5.3. The convex hull of a set of N points in the plane can be
determined in time $O((\log N)^2)$ on a SMM with N processors and N
memory units.

5.4  On  the  CCC  with  N  Processors 

In this section we discuss how the convex hulls algorithm developed
in Section 5.3 can be implemented on a CCC with N processors in $O((\log N)^2)$
parallel steps. We shall discuss the data movement in detail.


PARALLEL ALGORITHMS FOR GEOMETRIC PROBLEMS(U) ILLINOIS
UNIV AT URBANA APPLIED COMPUTATION THEORY GROUP
A L CHOW DEC 81 ACT-30 N00014-79-C-0424


5.4.1


The function TANGENTS1 introduced in Section 5.3.2 for determining
the indices of the extremes of the left and the right tangents of two
convex polygons cannot be directly implemented on a CCC. We shall make
some modifications to TANGENTS1 so that it will be suitable for
implementation on a CCC.

Using the facts that j* is the minimum among the $j^{(i)}$'s and that the
sequences of $\gamma_{i,j}$'s are V-bitonic, we can determine j* as follows.

First of all (refer to Figure 27 for the following discussion), we
describe how to determine simultaneously a set of integers
$\{J(i) : i = \sqrt{r_A+1}, 2\sqrt{r_A+1}, \dots, (\sqrt{r_A+1}-1)\sqrt{r_A+1}\}$, where
$\gamma_{i,J(i)} = \min\{\gamma_{i,\sqrt{r_B+1}}, \dots, \gamma_{i,(\sqrt{r_B+1}-1)\sqrt{r_B+1}}\}$, if $B(r_B)$ is to the left of
$A(r_A)$; and a set of integers $\{J(i) : i = r_A+\sqrt{s_A-r_A+1}, r_A+2\sqrt{s_A-r_A+1}, \dots\}$,
where $\gamma_{i,J(i)} = \min\{\gamma_{i,r_B+\sqrt{s_B-r_B+1}}, \gamma_{i,r_B+2\sqrt{s_B-r_B+1}}, \dots\}$, if $B(r_B)$ is not to the
left of $A(r_A)$. We now consider two duplicating patterns of a data array
D(0: q-1): (i) the first pattern, to be referred to as P1(ℓ), consists in
duplicating D ℓ times into {D(0), D(1), ..., D(q-1), D(0), ..., D(q-1), ...};
(ii) the second pattern, to be referred to as P2(ℓ), consists in
duplicating each element of D ℓ times into {D(0), D(0), ..., D(0), D(1), ...,
D(1), ..., D(q-1), ..., D(q-1)}. Both patterns have q·ℓ elements. The first
pattern P1(ℓ) can be achieved by copying each element of {D(0), ..., D(q-1)}
into the module q positions away, then copying each element of
{D(0), ..., D(q-1), D(0), ..., D(q-1)} into the module 2q positions away,
and so on. It will take logarithmic steps to achieve the pattern P1(ℓ).

We achieve the second pattern P2(ℓ) as follows. We copy D(0), D(1), ...,
D(q-1) into modules 0, ℓ, 2ℓ, ..., (q-1)ℓ respectively by a reverse process
of the concentration procedure described in Section 2.2.1. We then
perform a selected broadcasting as described in Section 2.2.2 to achieve
pattern P2(ℓ). Recall that both of these operations can be achieved in
logarithmic time. Therefore, both patterns can be achieved on a CCC with
q·ℓ processors in O(log(q·ℓ)) steps. We shall discuss only the case that
$B(r_B)$ is to the left of $A(r_A)$; the other case can be treated in a similar
manner. We duplicate $\{B(\sqrt{r_B+1}), B(2\sqrt{r_B+1}), \dots, B((\sqrt{r_B+1}-1)\sqrt{r_B+1})\}$ into
pattern $P1(\sqrt{r_A+1}-1)$ and $\{A(\sqrt{r_A+1}), A(2\sqrt{r_A+1}), \dots, A((\sqrt{r_A+1}-1)\sqrt{r_A+1})\}$ into
pattern $P2(\sqrt{r_B+1}-1)$. Now we can compute $\{\gamma_{\sqrt{r_A+1},\sqrt{r_B+1}}, \dots,
\gamma_{\sqrt{r_A+1},(\sqrt{r_B+1}-1)\sqrt{r_B+1}}, \gamma_{2\sqrt{r_A+1},\sqrt{r_B+1}}, \dots, \gamma_{(\sqrt{r_A+1}-1)\sqrt{r_A+1},(\sqrt{r_B+1}-1)\sqrt{r_B+1}}\}$
in constant time. Since the sequences $(\gamma_{i,\sqrt{r_B+1}}, \dots, \gamma_{i,(\sqrt{r_B+1}-1)\sqrt{r_B+1}})$, for
$i = \sqrt{r_A+1}, 2\sqrt{r_A+1}, \dots, (\sqrt{r_A+1}-1)\sqrt{r_A+1}$, are V-bitonic, the indices J(i) of
the minima of the sequences can be determined in logarithmic time. Figure
27(a) shows three V-bitonic sequences $(\gamma_{i,\sqrt{r_B+1}}, \gamma_{i,2\sqrt{r_B+1}}, \dots)$, for
$i = \sqrt{r_A+1}, 2\sqrt{r_A+1}, 3\sqrt{r_A+1}$, and the values of J(i). The index j', the
minimum of J(i), can be determined in $O(\log\sqrt{nm})$ time on the CCC.

We then determine J'(i), where $\gamma_{i,J'(i)} = \min\{\gamma_{i,j'-\sqrt{r_B+1}+1}, \dots, \gamma_{i,j'}, \dots,
\gamma_{i,j'+\sqrt{r_B+1}-1}\}$, for $i = \sqrt{r_A+1}, \dots, (\sqrt{r_A+1}-1)\sqrt{r_A+1}$,
in the same way as we determine J(i). We also find î which is the
smallest index such that J'(î) is a minimum among $\{J'(\sqrt{r_A+1}), J'(2\sqrt{r_A+1}), \dots,
J'((\sqrt{r_A+1}-1)\sqrt{r_A+1})\}$. It is easy to show that î can be determined in $O(\log\sqrt{nm})$
on the CCC. Now j* is the minimum of $\{j^{(\hat{\imath}-\sqrt{r_A+1}+1)}, \dots, j^{(\hat{\imath}+\sqrt{r_A+1}-1)}\}$ and
can be found in a procedure similar to the one given above. The
procedure R_TANGENT_INDEX, which is a formal description of what we
discussed above, will be presented in the appendix.


In an analogous way, we can describe a procedure L_TANGENT_INDEX(A,B)
which returns j̄*. Knowing j* and j̄*, we can determine i* and ī* by
finding pairs of (i', j*) and (i'', j̄*) which satisfy properties (1) and (2)
defined in Section 5.2.


function TANGENTS2(A,B)

/* return the indices of the extremes of left and right tangent
   of A and B */
begin

   /* determine j* and j̄* */
   j* ← R_TANGENT_INDEX(A,B)
   j̄* ← L_TANGENT_INDEX(A,B)

   /* determine i* and ī* with which j* and j̄* respectively
      satisfy property (1) and (2) */

   if x-value of $B(r_B)$ < x-value of $A(r_A)$
      then begin a ← 0; b ← $r_A$; end
      else begin a ← $r_A$; b ← $s_A$; end

   foreach i, a ≤ i ≤ b do
      if $\gamma_{i,j^*-1} > \gamma_{i,j^*} \le \gamma_{i,j^*+1}$                          /* property (1) */
         and $\alpha_{i,i-1} > \gamma_{i,j^*}$ and $\alpha_{i,i+1} - \gamma_{i,j^*} < \pi$      /* property (2) */
      then i* ← i

   if x-value of $B(\ell_B)$ < x-value of $A(\ell_A)$
      then begin a ← $s_A$; b ← $\ell_A$; end
      else begin a ← $\ell_A$; b ← n; end

   foreach i, a ≤ i ≤ b do
      if $\gamma_{i,\bar{j}^*-1} > \gamma_{i,\bar{j}^*} \le \gamma_{i,\bar{j}^*+1}$                          /* property (1) */
         and $\alpha_{i,i+1} < \gamma_{i,\bar{j}^*}$ and $\alpha_{i,i-1} - \gamma_{i,\bar{j}^*} > \pi$      /* property (2) */
      then ī* ← i

   return (j*, i*, j̄*, ī*)

end


Therefore, the left and right tangents can be determined in time
O(log(n+m)) on a CCC with n+m processors. Next, we shall consider the
entire convex hulls algorithm.

5.4.2  Convex  Hulls  Algorithm 

We presort the set S of points by their y coordinates in descending
order. This can be done in time $O((\log N)^2)$ on a CCC with N processors
[31]. The convex hulls algorithm has the same structure as the one
described in Section 5.3.3. The main difference is in the merging step.


function MERGE2(A,B)

begin /* determine the tangents */

   (j*, i*, j̄*, ī*) ← TANGENTS2(A,B)

   /* reorder the vertices */

   foreach i, 0 ≤ i < n do T2(i) ← A(i)

   foreach i, 0 ≤ i < m do T1(i) ← T3(i) ← B(i)

   if j*+1 > i* then shift T2 forward by j*+1-i* positions
                else shift T2 backward by i*-j*-1 positions

   if (j*+1+ī*-i*) > j̄* then shift T3 forward by (j*+ī*-i*+2)-j̄* positions
                        else shift T3 backward by j̄*-(j*+ī*-i*+2)
                             positions

   foreach i, 0 ≤ i ≤ j* do C(i) ← T1(i)
   foreach i, j*+1 ≤ i ≤ j*+ī*-i*+1 do C(i) ← T2(i)
   foreach i, j*+ī*-i*+2 ≤ i ≤ j*+ī*-i*+2+m-j̄*
      do C(i) ← T3(i)
   return (C(0: j*-i*+ī*-j̄*+m-1))

end

Cyclic forward or backward shift of an array of data can be
implemented on a CCC with n+m processors in O(log(n+m)) parallel steps.
Therefore, MERGE2 runs in time O(log(n+m)) on a CCC with n+m processors.
We immediately obtain an $O((\log N)^2)$ algorithm for finding the convex hull
of N points in the plane.
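One common way to obtain a cyclic shift in a logarithmic number of rounds is to decompose the shift amount into powers of two and apply one fixed-distance move per set bit. The Python sketch below shows that decomposition on a list; it is an assumption about how such a shift could be organised, not a transcription of the CCC routine, and the name `cyclic_shift` is hypothetical.

```python
def cyclic_shift(data, s):
    """Shift a non-empty list cyclically forward by s positions.

    A negative s shifts backward.  Each loop iteration corresponds to one
    round in which every element moves by a fixed power-of-two distance.
    """
    n = len(data)
    s %= n                      # backward shifts become forward shifts
    out, stage, rounds = list(data), 1, 0
    while stage < n:
        if s & stage:           # this power of two is part of the shift
            out = [out[(k - stage) % n] for k in range(n)]
        stage <<= 1
        rounds += 1
    return out, rounds
```

The number of rounds depends only on the array length, so shifting T2 and T3 in MERGE2 by any amount fits the O(log(n+m)) bound quoted above.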


function CH22 (S(0: N-1)):

/* returns CH(S); S is presorted by y coordinates in descending order */
begin if N ≤ 2 then return (S)

   else return (MERGE2(CH22(S(N/2: N-1)), CH22(S(0: N/2-1))));

end

Theorem 5.4. The convex hull of a set of N points in the plane can be
determined in time $O((\log N)^2)$ on a CCC with N processors.
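The bound in Theorem 5.4 can be checked by unrolling the recursion of CH22. The derivation below is a sketch under two assumptions that are not spelled out in the text: N is a power of two, and the merge at size N costs at most $c \log_2 N$ steps for some constant c (the O(log(n+m)) bound for MERGE2 with n + m = N).

```latex
\begin{align*}
T(N) &\le T(N/2) + c\log_2 N \\
     &\le T(2) + c\sum_{k=0}^{\log_2 N - 2} \log_2\!\left(\frac{N}{2^k}\right)
      = T(2) + c\sum_{i=2}^{\log_2 N} i \\
     &\le T(2) + \frac{c}{2}\,\log_2 N\,(\log_2 N + 1)
      = O\big((\log N)^2\big).
\end{align*}
```

The presorting step adds another $O((\log N)^2)$ term once, so it does not change the order of the total.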

5.5 On the CCC with $2N^{1+\alpha}$ Processors

In this section we shall develop a "divide and conquer" algorithm
for finding the convex hull of a set S of N points in the plane on a CCC
with $2N^{1+\alpha}$ processors, $0 < \alpha \le 1$. We partition S into $N^\alpha$ subsets
$S_0, S_1, \dots, S_{N^\alpha-1}$ of $N^{1-\alpha}$ elements each. We then determine convex hulls
$CH(S_0), \dots, CH(S_{N^\alpha-1})$ simultaneously. Finally $CH(S_0), \dots, CH(S_{N^\alpha-1})$ are
merged to give CH(S). Since the determinations of $CH(S_0), \dots, CH(S_{N^\alpha-1})$ are
recursive calls, we obtain for the running time T(N) of this algorithm
the recurrence relation

$T(N) = T(N^{1-\alpha}) + M(N)$,

where M(N) is the time to merge $CH(S_0), \dots, CH(S_{N^\alpha-1})$. If we can show that
$N^\alpha$ convex hulls can be merged in time O(log N) with $2N^{1+\alpha}$ processors,
then we have $T(N) = O(\frac{1}{\alpha}\log N)$.
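The step from the recurrence to the $\frac{1}{\alpha}\log N$ bound is a geometric series: the subproblem sizes are N, $N^{1-\alpha}$, $N^{(1-\alpha)^2}$, ..., so the merge costs sum to at most $\log N \cdot \sum_k (1-\alpha)^k = \frac{1}{\alpha}\log N$. The small Python check below evaluates the unrolled sum numerically; the function name `unrolled_cost`, the constant `c`, and the cutoff at size 2 are illustrative assumptions.

```python
import math

def unrolled_cost(N, alpha, c=1.0):
    """Sum c*log2(size) over the recursion sizes N, N**(1-alpha), ...

    The recursion is cut off once the subproblem size drops to 2 or less,
    mirroring the base case of the convex hull functions.
    """
    total, size = 0.0, float(N)
    while size > 2:
        total += c * math.log2(size)
        size = size ** (1.0 - alpha)
    return total

# The unrolled sum never exceeds (1/alpha) * log2(N).
bound_holds = all(
    unrolled_cost(N, a) <= math.log2(N) / a + 1e-9
    for N in (16, 1000, 2 ** 16, 10 ** 6)
    for a in (0.1, 0.25, 0.5, 0.75, 1.0)
)
```

For $\alpha = 1$ the recursion has a single level and the cost is just the one merge, which matches the constant factor $1/\alpha = 1$.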

We shall define some terms and then describe the merger, which is a
major part of our convex hulls algorithm.

5.5.1  Notations  and  Definitions 

Consider a set of polygons $A_0, A_1, \dots, A_{N^\alpha-1}$ ($0 < \alpha \le 1$), each having at
most $N^{1-\alpha}$ vertices. Each $A_i$ is in standard form, that is $A_i(0: n_i-1)$
is the clockwise sequence of its vertices starting with the one with largest
y coordinate. Variables $r_i$, $s_i$, $\ell_i$ denote the indices of the
rightmost, bottommost and leftmost vertices of $A_i$. We assume that the
y-coordinates of $A_k(0: n_k-1)$ are less than those of $A_\ell(0: n_\ell-1)$ for $k > \ell$,
that is in any horizontal slab there will be only one $A_i$. The indices
of the extremes of the left and the right tangents of $A_k$ and $A_\ell$ ($k > \ell$)
are $(i^*_{k,\ell}, j^*_{k,\ell})$ and $(\bar{i}^*_{k,\ell}, \bar{j}^*_{k,\ell})$ respectively (refer to Figure 28). We define
the polar angles $\theta_{k,\ell} = \theta(A_k(i^*_{k,\ell}), A_\ell(j^*_{k,\ell}))$ and
$\bar{\theta}_{k,\ell} = \theta(A_k(\bar{i}^*_{k,\ell}), A_\ell(\bar{j}^*_{k,\ell}))$.(1)

5.5.2  Merging  Multiple  Convex  Hulls 

We shall discuss how to merge the set of convex polygons,
$A_0, \dots, A_{N^\alpha-1}$, as introduced in Section 5.5.1. Like merging two convex
polygons, we have to determine those vertices belonging to the resulting
convex hull and those becoming internal to the resulting convex hull; then
we have to rearrange the vertices. We shall develop some preliminary tools
first.

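Lemma 5.7. If $\theta_{i,k} < \theta_{i,\ell}$, or $\theta_{i,k} = \theta_{i,\ell}$ and $\ell < k$, for k and $\ell < i$, then
$(A_i(i^*_{i,k}), A_k(j^*_{i,k}))$ is not an edge of the resulting convex hull of
$A_0, \dots, A_{N^\alpha-1}$.

Proof: We have to consider two cases (a) $\ell < k$ and (b) $\ell > k$. Referring
to Figure 29, in both cases, the edge $(A_i(i^*_{i,k}), A_k(j^*_{i,k}))$ becomes internal
to the edge $(A_i(i^*_{i,\ell}), A_\ell(j^*_{i,\ell}))$. □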


(1) In the implementation, the operation of comparing two angles will
be replaced by the operation of comparing the negative values of
their cotangents as in the case of $\alpha_{i,j}$ and $\gamma_{i,j}$.

We associate with each polygon $A_i$ an index t(i) (< i) which is the
smallest index such that $\theta_{i,t(i)} \ge \theta_{i,k}$, $0 \le k < i$. Using Lemma 5.7,
we have the following result.

Corollary 5.2. Among all edges $(A_i(i^*_{i,k}), A_k(j^*_{i,k}))$ ($0 \le k < i$),
$(A_i(i^*_{i,t(i)}), A_{t(i)}(j^*_{i,t(i)}))$ (to be referred to as edge candidate) is
the only candidate for being an edge of the resulting convex hull of
$A_0, \dots, A_{N^\alpha-1}$.

We now consider polygons below $A_i$.

Lemma 5.8. If $\theta_{k,i} > \theta_{\ell,i}$, or $\theta_{k,i} = \theta_{\ell,i}$ and $k < \ell$, for k, $\ell > i$, then
$(A_i(j^*_{k,i}), A_k(i^*_{k,i}))$ is not an edge of the resulting convex hull of
$A_0, \dots, A_{N^\alpha-1}$.

Proof: We have to consider two cases (a) $k < \ell$ and (b) $k > \ell$. Referring
to Figure 30, in both cases, the edge $(A_i(j^*_{k,i}), A_k(i^*_{k,i}))$ becomes internal
to edge $(A_i(j^*_{\ell,i}), A_\ell(i^*_{\ell,i}))$. □

We associate with each $A_i$ an index b(i) (> i) which is the largest
index such that $\theta_{b(i),i} \le \theta_{k,i}$, $i < k \le N^\alpha-1$. Again using Lemma 5.8,
we have this result.

Corollary 5.3. Among all edges $(A_i(j^*_{k,i}), A_k(i^*_{k,i}))$ ($i < k \le N^\alpha-1$),
$(A_i(j^*_{b(i),i}), A_{b(i)}(i^*_{b(i),i}))$ (to be referred to as edge candidate)
is the only candidate for being an edge of the convex hull of
$A_0, \dots, A_{N^\alpha-1}$.

We are now able to determine if the edge candidates are edges of
the convex hull of $A_0, \dots, A_{N^\alpha-1}$ as follows.

Theorem 5.5. The edge candidates are edges of the convex hull of
$A_0, \dots, A_{N^\alpha-1}$ if and only if $i^*_{i,t(i)} > j^*_{b(i),i}$ or ($i^*_{i,t(i)} = j^*_{b(i),i}$ and
$\Delta\theta = \theta(A_i(j^*_{b(i),i}), A_{b(i)}(i^*_{b(i),i})) - \theta(A_i(i^*_{i,t(i)}), A_{t(i)}(j^*_{i,t(i)})) > \pi$).

Proof: Suppose $i^*_{i,t(i)} < j^*_{b(i),i}$ (refer to Figure 31(a)) or
$i^*_{i,t(i)} = j^*_{b(i),i}$ and $\Delta\theta < \pi$ (refer to Figure 31(b)). We have
$\theta_{b(i),i} < \theta_{b(i),t(i)}$ and $\theta_{i,t(i)} > \theta_{b(i),t(i)}$. Thus, by Lemmas 5.7 and 5.8,
edges $(A_{b(i)}(i^*_{b(i),i}), A_i(j^*_{b(i),i}))$ and $(A_{t(i)}(j^*_{i,t(i)}), A_i(i^*_{i,t(i)}))$ are
not edges of the convex hull of $A_0, \dots, A_{N^\alpha-1}$.

Suppose $i^*_{i,t(i)} > j^*_{b(i),i}$ (refer to Figure 31(c)) or $i^*_{i,t(i)} = j^*_{b(i),i}$
and $\Delta\theta > \pi$ (refer to Figure 31(d)). By the definitions of t(i) and b(i),
all $A_0, \dots, A_{N^\alpha-1}$ are on the same side of the edge candidates. Thus, the
candidates are edges of the convex hull of $A_0, \dots, A_{N^\alpha-1}$. □

We now describe the analog for the right tangents. The index $\bar{t}(i)$ is
the smallest one such that $\bar{\theta}_{i,\bar{t}(i)} \le \bar{\theta}_{i,k}$, $0 \le k < i$. And the index
$\bar{b}(i)$ is the largest such that $\bar{\theta}_{\bar{b}(i),i} \ge \bar{\theta}_{k,i}$, $i < k \le N^\alpha-1$. We shall state
without proof the analogous lemmas, corollaries, and theorems for the
right tangents.

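Lemma 5.9. If $\bar{\theta}_{i,k} > \bar{\theta}_{i,\ell}$, or $\bar{\theta}_{i,k} = \bar{\theta}_{i,\ell}$ and $\ell < k$, for k, $\ell < i$, then
$(A_i(\bar{i}^*_{i,k}), A_k(\bar{j}^*_{i,k}))$ is not an edge of the resulting convex hull of
$A_0, \dots, A_{N^\alpha-1}$.

Corollary 5.4. Among all edges $(A_i(\bar{i}^*_{i,k}), A_k(\bar{j}^*_{i,k}))$ ($0 \le k < i$),
$(A_i(\bar{i}^*_{i,\bar{t}(i)}), A_{\bar{t}(i)}(\bar{j}^*_{i,\bar{t}(i)}))$ is the only edge candidate.

Lemma 5.10. If $\bar{\theta}_{k,i} < \bar{\theta}_{\ell,i}$, or $\bar{\theta}_{k,i} = \bar{\theta}_{\ell,i}$ and $k < \ell$, for k, $\ell > i$, then
$(A_i(\bar{j}^*_{k,i}), A_k(\bar{i}^*_{k,i}))$ is not an edge of the convex hull of $A_0, \dots, A_{N^\alpha-1}$.

Corollary 5.5. Among all edges $(A_i(\bar{j}^*_{k,i}), A_k(\bar{i}^*_{k,i}))$ ($i < k \le N^\alpha-1$),
$(A_i(\bar{j}^*_{\bar{b}(i),i}), A_{\bar{b}(i)}(\bar{i}^*_{\bar{b}(i),i}))$ is the only edge candidate.

Theorem 5.6. The edge candidates are edges of the convex hull of
$A_0, \dots, A_{N^\alpha-1}$ if and only if $\bar{i}^*_{i,\bar{t}(i)} < \bar{j}^*_{\bar{b}(i),i}$ or ($\bar{i}^*_{i,\bar{t}(i)} = \bar{j}^*_{\bar{b}(i),i}$
and $\Delta\bar{\theta} = \theta(A_i(\bar{j}^*_{\bar{b}(i),i}), A_{\bar{b}(i)}(\bar{i}^*_{\bar{b}(i),i})) - \theta(A_i(\bar{i}^*_{i,\bar{t}(i)}), A_{\bar{t}(i)}(\bar{j}^*_{i,\bar{t}(i)})) < \pi$).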

Before discussing how to obtain indices t(i), b(i), $\bar{t}(i)$, and $\bar{b}(i)$,
etc., we present an example of merging five convex polygons in Figure 32.

In Figure 32, $\theta_{2,0} > \theta_{2,1}$; therefore by Lemma 5.7 and Corollary 5.2,
$(A_2(i^*_{2,0}), A_0(j^*_{2,0}))$ is an edge candidate while edge $(A_2(i^*_{2,1}), A_1(j^*_{2,1}))$
is eliminated. Also $\theta_{4,2} > \theta_{3,2}$; therefore by Lemma 5.8 and Corollary 5.3,
$(A_2(j^*_{3,2}), A_3(i^*_{3,2}))$ is an edge candidate while edge $(A_2(j^*_{4,2}), A_4(i^*_{4,2}))$
is eliminated. However, by Theorem 5.5, both of these edge candidates
will be eliminated because $i^*_{2,0} < j^*_{3,2}$. With similar arguments, all
lines of support, except those shown in the figure, will be eliminated.
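The selection of edge candidates for the left tangents can be sketched in Python. The sketch follows the inequality directions used in the Figure 32 discussion (the larger angle wins among polygons above, the smaller among polygons below), which is an assumption wherever the scanned lemma statements are unclear. The function names, the dictionary layout `theta[(k, l)]`, and the numeric angles are all made up for illustration.

```python
import math

def top_candidate(theta, i):
    """t(i): smallest k < i that maximises theta[(i, k)]."""
    return max(range(i), key=lambda k: (theta[(i, k)], -k))

def bottom_candidate(theta, i, count):
    """b(i): largest k with i < k < count that minimises theta[(k, i)]."""
    return min(range(i + 1, count), key=lambda k: (theta[(k, i)], -k))

def candidates_survive(i_top, j_bottom, delta_theta):
    """Test of Theorem 5.5 on polygon A_i.

    i_top is the index on A_i where the candidate towards A_t(i) leaves,
    j_bottom the index where the candidate from A_b(i) arrives.
    """
    return i_top > j_bottom or (i_top == j_bottom and delta_theta > math.pi)

# Hypothetical angles shaped like the Figure 32 walkthrough for polygon A_2.
theta = {(2, 0): 1.0, (2, 1): 0.5, (3, 2): 0.7, (4, 2): 0.9}
t2 = top_candidate(theta, 2)            # A_0 beats A_1
b2 = bottom_candidate(theta, 2, 5)      # A_3 beats A_4
```

On the parallel machine each of these selections is a minimum or maximum over a group of equal size, which is why the text can bound them by O(log N) steps.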


The resulting convex hull is $(A_0(0), A_0(1), \dots, A_0(j^*_{1,0}), A_1(i^*_{1,0}),
A_1(i^*_{1,0}+1), \dots, A_1(j^*_{4,1}), A_4(i^*_{4,1}), A_4(i^*_{4,1}+1), \dots, A_4(\bar{i}^*_{4,3}), A_3(\bar{j}^*_{4,3}),
A_3(\bar{j}^*_{4,3}+1), \dots, A_3(\bar{i}^*_{3,0}), A_0(\bar{j}^*_{3,0}), A_0(\bar{j}^*_{3,0}+1), \dots, A_0(n_0-1))$. We now
discuss how to obtain the resulting convex hull in the general case.

We first copy $A_0, \dots, A_{N^\alpha-1}$ into the following pattern P3:

$A_0A_1\ A_0A_2\ \dots\ A_0A_{N^\alpha-1}\ A_1A_0\ A_1A_2\ \dots\ A_1A_{N^\alpha-1}\ \dots\ A_{N^\alpha-1}A_0\ \dots\ A_{N^\alpha-1}A_{N^\alpha-2}$

Therefore, pairs of polygons $A_iA_k$, $k \ne i$ and $i = 0, \dots, N^\alpha-1$, are adjacent.
We then use the procedure TANGENTS2($A_i$, $A_k$) in Section 5.4.1 to determine
$j^*_{i,k}$, $i^*_{i,k}$, $\bar{j}^*_{i,k}$ and $\bar{i}^*_{i,k}$. The number of processors required in the
copying is $N^\alpha \cdot 2(N^\alpha-1) \cdot N^{1-\alpha} < 2N^{1+\alpha}$, and it can be achieved in O(log N)
parallel steps with some simple-minded algorithm. Determination of the
indices t(i), b(i), $\bar{t}(i)$, and $\bar{b}(i)$ involves finding minimum and maximum
of multisets of uniform size; so it can be achieved in O(log N) steps.

Using Theorems 5.5 and 5.6, we can determine whether $A_i(i^*_{i,t(i)})$,
$A_i(j^*_{b(i),i})$, $A_i(\bar{i}^*_{i,\bar{t}(i)})$ and $A_i(\bar{j}^*_{\bar{b}(i),i})$ are vertices of the convex
hull of $A_0, \dots, A_{N^\alpha-1}$. Rearranging vertices of the resulting convex hull
involves order reversing and data extraction; both can be carried out in
O(log N). Although the details of this algorithm are a bit tedious to
describe, it should be clear that merging $N^\alpha$ convex polygons, each having
at most $N^{1-\alpha}$ vertices, can be performed on a CCC with $2N^{1+\alpha}$ processors
in time O(log N).

The entire convex hulls algorithm is a "divide and conquer" program.
The subproblems are solved recursively in parallel. Therefore, the
running time of this algorithm is $O(\frac{1}{\alpha}\log N)$.

Theorem 5.7. The convex hull of a set of N points in the plane can be
determined in time $O(\frac{1}{\alpha}\log N)$ on a CCC with $N^{1+\alpha}$ processors, $0 < \alpha < 1$.

CHAPTER 6

CONVEX HULLS OF SETS OF POINTS IN THREE DIMENSIONS

The convex hull of a set of points in three dimensions is a convex
polyhedron. A convex polyhedron is specified completely by its edges
and faces. It is represented by the arrays of edges E(0: |E|-1) and of
faces F(0: |F|-1). It is a crucial observation that the set of edges
of a convex polyhedron forms a planar graph: if we exclude degeneracies, it
forms a triangulation. Thus, we know that |E| and |F| are at most 3N-6
and 2N-4 respectively, by Euler's polyhedron theorem, where N (≥ 3) is the
number of vertices.
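The two bounds follow from Euler's formula in a few lines. The derivation below is a standard argument, included here as a worked step; it uses only the facts that every face is bounded by at least three edges and every edge borders exactly two faces.

```latex
\begin{align*}
N - |E| + |F| &= 2                      && \text{(Euler's formula)} \\
3|F| &\le 2|E|                          && \text{(each face has $\ge 3$ edges, each edge lies on 2 faces)} \\
|E| = N + |F| - 2 &\le N + \tfrac{2}{3}|E| - 2
      \;\Longrightarrow\; |E| \le 3N - 6 \\
|F| \le \tfrac{2}{3}|E| &\le 2N - 4
\end{align*}
```

Both bounds hold with equality exactly when every face is a triangle, which is the non-degenerate case mentioned above.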

In [30], Preparata and Hong show that the convex hull of a set of N
points in three dimensions can be determined serially with O(N log N)
operations. Their algorithm uses the "divide and conquer" technique and
recursively applies a merge procedure for two nonintersecting convex
hulls which consists of two major steps: (1) construction of a
"cylindrical" triangulation J, which is tangent to the convex hulls along
two circuits; (2) removal from both convex hulls of the respective portions
which have been "obscured" by J. In this chapter, this solution is
reorganized so that parallel operations are possible.

6.1 Definitions and Preliminaries

We consider a convex polyhedron with edges E(0: |E|-1) and faces
F(0: |F|-1). Element E(i) is a record consisting of fields: $V_1$ and $V_2$
which are the extremes of this edge; $F_1$ and $F_2$ which are indices of
the two faces bounded by this edge. Each element F(i) is also a record
of three fields: $E_1$, $E_2$, and $E_3$ which are indices of the three bounding
edges of F(i).

We can represent face $F_i$ by an equation $\alpha_i x + \beta_i y + \gamma_i z + \delta_i = 0$ with
normal vector $(a_i, b_i, c_i)$ pointing away from the polyhedron, where
$(a_i, b_i, c_i) = (\alpha_i, \beta_i, \gamma_i)/\sqrt{\alpha_i^2 + \beta_i^2 + \gamma_i^2}$.

The convex angle formed by faces $F_i$ and $F_j$ with normal vectors $(a_i, b_i, c_i)$
and $(a_j, b_j, c_j)$ respectively is $\cos^{-1}((a_i, b_i, c_i) \cdot (a_j, b_j, c_j))$ which is
$\cos^{-1}(a_ia_j + b_ib_j + c_ic_j)$. In the range $0 \le \theta \le \pi$, the function $\cos\theta$ is
decreasing from 1 to -1; so the inverse function $\cos^{-1} a$ decreases as a
increases. Note that the distance between two points $(a_i, b_i, c_i)$ and
$(a_j, b_j, c_j)$ is $\sqrt{2(1 - (a_ia_j + b_ib_j + c_ic_j))}$, since $a_i^2 + b_i^2 + c_i^2 = a_j^2 + b_j^2 + c_j^2 = 1$.
Therefore, $\cos^{-1}(a_ia_j + b_ib_j + c_ic_j)$ decreases as $\sqrt{2(1 - (a_ia_j + b_ib_j + c_ic_j))}$
decreases and we conclude this discussion by the following theorem.

Theorem 6.1. The convex angle that face $F_i$ with normal vector $(a_i, b_i, c_i)$
forms with face $F_j$ with normal vector $(a_j, b_j, c_j)$ decreases as the distance
between points $(a_i, b_i, c_i)$ and $(a_j, b_j, c_j)$ decreases.
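Theorem 6.1 can be checked numerically on a few unit normals. The Python sketch below computes both the angle and the straight-line distance between two unit vectors; the function names and the sample vectors are chosen here only for illustration.

```python
import math

def angle_between(u, v):
    """Angle in [0, pi] between two unit vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return math.acos(max(-1.0, min(1.0, dot)))   # clamp rounding noise

def chord_length(u, v):
    """Straight-line distance between the two points on the unit sphere."""
    return math.dist(u, v)

u = (0.0, 0.0, 1.0)
near = (0.0, 0.6, 0.8)      # unit vector, dot product with u is 0.8
far = (0.0, 1.0, 0.0)       # unit vector, dot product with u is 0.0

# The distance agrees with sqrt(2 * (1 - dot)), and the ordering by
# distance is the same as the ordering by angle.
same_order = (chord_length(u, near) < chord_length(u, far)) == (
    angle_between(u, near) < angle_between(u, far)
)
```

This equivalence is what lets the angle comparison be replaced by a nearest-neighbor search among points on the unit sphere.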

6.2  Merging  Two  Convex  Polyhedra 

Consider two nonintersecting convex polyhedra A and B with edge sets
$E_A(0: |E_A|-1)$ and $E_B(0: |E_B|-1)$ respectively, and with face sets
$F_A(0: |F_A|-1)$ and $F_B(0: |F_B|-1)$ respectively. We obtain the convex hull
CH(A,B) of A and B in two steps: removal from A and B of the faces which do
not belong to CH(A,B) (these faces will be referred to as internal faces);
and addition of faces which are tangent to A and B along two circuits
(which will be defined later).

6.2.1 Removal of Internal Faces

Consider the half-spaces bounded by $F_A(i)$ of A; we denote the
half-space that contains A by H(A,i) and denote the other one that does
not contain A by $\bar{H}(A,i)$. Face $F_A(i)$ belongs to CH(A,B) if B lies in the
half-space H(A,i). Consider the pair of parallel planes of support PL'(i)
and PL''(i), which are parallel to face $F_A(i)$ and bounding the convex
polyhedron B. We define the two associated faces $F_B(i')$ and $F_B(i'')$ of
$F_A(i)$ as follows: $F_B(i')$ is a face of B making the smallest angle with
PL'(i) among all the faces of B that intersect at the point of tangency
with PL'(i); and $F_B(i'')$ is a face of B making the smallest angle with PL''(i)
among all the faces of B that intersect at the point of tangency with PL''(i).
Due to convexity, every face of B is in H(A,i) if $F_B(i'')$ and $F_B(i')$ are in
H(A,i). We demonstrate what we have just discussed by a two-dimensional
analogy in Figure 33. $F_A(i)$ will belong to CH(A,B) because $F_B(i'')$ and
$F_B(i')$ are in half-space H(A,i) while $F_A(j)$ will become internal to CH(A,B)
because $F_B(j')$ is in $\bar{H}(A,j)$.

We now describe how to determine the associated faces of F_A(i).
We first transform faces F_B(0: |F_B|-1) of B into points P_B(0: |F_B|-1)
on the surface of the unit sphere, where P_B(j) = (a_j,b_j,c_j) and
(a_j,b_j,c_j) is the normal vector, pointing away from B, of F_B(j). We
search in P_B(0: |F_B|-1) for the nearest neighbors P_B(i") and P_B(i') of
(a_i,b_i,c_i) and (-a_i,-b_i,-c_i) respectively, where (a_i,b_i,c_i) is the
normal vector of F_A(i). By Theorem 6.1, F_B(i") and F_B(i') are the
associated faces of F_A(i). We shall repeatedly perform nearest neighbor
searches for all points ±(a_i,b_i,c_i) on P_B(0: |F_B|-1); therefore, it is
beneficial to arrange P_B(0: |F_B|-1) into an organized structure to
facilitate searching. Since P_B(0: |F_B|-1) is on the surface of the unit
sphere, we can construct a spherical Voronoi diagram [8] of P_B(0: |F_B|-1).
A spherical Voronoi diagram of a set of points P(0: n-1) on a sphere is a
partition of the surface of the sphere into n regions: region i for P(i)
is the locus of points on the surface of the sphere which are closer to
P(i) than to any other point in P(0: n-1). The problem of all nearest
neighbors searching is solved by performing point locations in the
spherical Voronoi diagram.
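
The nearest-neighbor search described above can be illustrated by a small brute-force sketch. It is not the spherical Voronoi point location of the text: it scans all face normals of B, which is enough to show what the two associated faces are. The function names are hypothetical, and the normals are assumed to be given as outward-pointing 3-tuples.

```python
from math import sqrt

def normalize(v):
    """Scale a nonzero 3-vector to unit length."""
    n = sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def nearest_on_sphere(points, q):
    """Index of the unit vector in `points` closest to the unit vector q.

    For unit vectors |p - q|^2 = 2 - 2 p.q, so the nearest point is the
    one with the largest dot product.
    """
    return max(range(len(points)),
               key=lambda j: sum(a * b for a, b in zip(points[j], q)))

def associated_faces(normal_a, normals_b):
    """Return (i'', i'): nearest neighbors of +n and -n among B's normals."""
    n = normalize(normal_a)
    pts = [normalize(m) for m in normals_b]
    neg = tuple(-c for c in n)
    return nearest_on_sphere(pts, n), nearest_on_sphere(pts, neg)
```

For an axis-aligned cube B and a face of A whose normal points almost straight up, the two associated faces are the top and bottom faces of B.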

In  [8],  Brown  presents  an  algorithm  for  constructing  the  spherical 
Voronoi  diagram  of  a  set  of  n  points  P(0:  n-1)  on  the  surface  of  a  sphere 
by  intersecting  half-spaces.  For  each  point  P(i)  there  is  a  plane  PL(i) 
tangent to the sphere at point P(i). Let H(i) be the half-space bounded
by  PL(i)  which  contains  the  entire  sphere.  The  intersection  of  the  n 
half-spaces  H(i)  forms  a  convex  body  C.  The  spherical  Voronoi  diagram  is 
now  obtained  by  a  simple  projection  of  the  edges  of  this  polyhedron  to  the 
surface  of  the  sphere.  This  projection  is  a  "radial"  projection:  the 
projection  of  a  point  Q  is  the  point  where  a  line  segment  connecting  the 
center  of  the  sphere  and  point  Q  intersects  the  sphere.  This  projection 
maps  edges  of  the  polyhedron  to  arcs  of  great  circles  on  the  sphere. 

The  vertices  of  the  polyhedron  are  mapped  to  spherical  Voronoi  points  and 
the  faces  of  the  polyhedron  are  mapped  to  spherical  Voronoi  regions. 
Let a_i x + b_i y + c_i z + d_i = 0 be the equation of face F_B(i) with
normal vector (a_i,b_i,c_i) pointing away from B. Then the plane PL(i) tangent
to the unit sphere at point (a_i,b_i,c_i) has equation
a_i x + b_i y + c_i z = sqrt(a_i^2 + b_i^2 + c_i^2); that is, PL(i) is obtained
from F_B(i) by a translation. Figure 34 shows the two-dimensional analogy of
the translation of faces of B. Therefore, the intersection of PL(i) and PL(j)
is an edge of C if and only if F_B(i) and F_B(j) are adjacent.
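
As a worked check of the translation claim, assume the normal n_i = (a_i,b_i,c_i) has unit length (this normalization is an assumption made here for the illustration). Translating F_B(i) along its own normal by 1 + d_i then lands exactly on the tangent plane:

```latex
\begin{align*}
F_B(i) &: \; n_i \cdot x = -d_i, \qquad n_i = (a_i, b_i, c_i),\ \lVert n_i \rVert = 1,\\
PL(i)  &: \; n_i \cdot x = 1,\\
x \in F_B(i) &\;\Longrightarrow\;
  n_i \cdot \bigl(x + (1 + d_i)\, n_i\bigr) = -d_i + (1 + d_i) = 1 .
\end{align*}
```

Since every face keeps its normal direction, the translation preserves the normals of B's faces.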

6.2.2  Addition  of  New  Faces 

In addition to the removal of internal faces, we have to construct faces
which are tangent to A and B along two circuits C_A and C_B (refer to Figure
35). The circuit C_A is composed of edges E_A(i) of A such that E_A(i)[F_1] is
an internal face and E_A(i)[F_2] is not, or vice versa. The edges in C_B are
determined in the same manner. We have to describe a criterion for uniquely
ordering the edges in C_A and C_B. We define observer B as an observer
placed at any point of B and oriented like the negative z-axis; and observer
A as an observer placed at any point of A and oriented like the positive
z-axis. The edges in C_A are numbered in ascending order so that they form
a clockwise sequence for an observer B. And the edges in C_B are numbered
in ascending order so that they form a counterclockwise sequence for an
observer A. We start both sequences at the vertices with largest
y-coordinates in C_A and C_B accordingly. Let C_A(i)[V_1] and C_B(j)[V_1] be the
vertices at which edges C_A(i) and C_B(j) originate respectively. Then
(C_A(0)[V_1], C_A(1)[V_1], ...) and (C_B(0)[V_1], C_B(1)[V_1], ...) are the sequences
of vertices of C_A and C_B respectively. Due to convexity, the convex
angle formed by (C_A(0)[V_1], C_A(i)[V_1]) and (C_A(0)[V_1], C_A(j)[V_1]) is
clockwise for an observer B, where i < j; the convex angle formed by
(C_B(0)[V_1], C_B(i)[V_1]) and (C_B(0)[V_1], C_B(j)[V_1]) is counterclockwise for
an observer A, where i < j. Therefore, edges in C_A can be ordered by
some simple sorting algorithm, and so can those in C_B.

We define an angle measure θ_A(i,j),(1) associated with edge C_A(i) and
vertex C_B(j)[V_1], as the convex angle formed by the plane determined by
C_A(i) and C_B(j)[V_1] and the face bounded by C_A(i), which belongs to
CH(A,B). In an analogous manner, we define θ_B(j,i) as the convex angle
formed by the plane determined by C_B(j) and C_A(i)[V_1] and the face bounded
by C_B(j), which belongs to CH(A,B). We also define j^(i) as the smallest
index such that θ_A(i,j^(i)) is a maximum among all θ_A(i,j), 0 ≤ j < |C_B|;
and i^(j) as the largest index such that θ_B(j,i^(j)) is a maximum among all
θ_B(j,i), 0 ≤ i < |C_A|. It is observed that (j^(0), j^(1), ...) and
(i^(0), i^(1), ...) are nondecreasing sequences. The faces determined by
C_A(i) and C_B(j^(i))[V_1] (or C_B(j) and C_A(i^(j))[V_1]) are tangent to A and B.
They are faces of CH(A,B).

6.3  On  the  SMM  with  N  Processors 

In this section we discuss the entire convex hulls algorithm in three
dimensions on the SMM. The crucial step is the implementation of the
merging of two convex polyhedra as described in the previous section. We
show that the merging runs in time O((log N)^2 log log N) with N processors,
which gives us an O((log N)^3 log log N) three-dimensional convex hulls algorithm
on a SMM with N processors.

(1) In the actual implementation, the operation of comparing two angles will be
replaced by the operation of comparing the negative values of their
cotangents.
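
The footnote works because -cot θ is strictly increasing for θ in (0, π), so the order of two angles equals the order of their negative cotangents, and the cotangent can be formed from a dot product and a cross product without any trigonometric call. A minimal sketch, assuming the angle is given by two nonparallel 3-vectors (the helper names are hypothetical):

```python
from math import sqrt

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def neg_cot(u, v):
    """-cot of the angle in (0, pi) between nonparallel 3-vectors u and v.

    cos = u.v / (|u||v|) and sin = |u x v| / (|u||v|), so the lengths cancel.
    """
    c = cross(u, v)
    return -dot(u, v) / sqrt(dot(c, c))
```

Comparing `neg_cot` values for angles of 30, 90 and 150 degrees gives the same order as comparing the angles themselves.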
6.3.1 Implementing the Merge Algorithm

We now present a top-down implementation of the merge algorithm on
the SMM. First we have to determine the internal faces. The following
procedure determines which faces of the convex polyhedron A are internal.
procedure INTERNALA(A,B,t_A)

/* Given two nonintersecting convex polyhedra A and B, for each
   face F_A(i) of A, determines if it is internal to the convex
   hull of A and B; it sets t_A(i) to 1 if F_A(i) is internal and
   0 otherwise */

1. transform each face F_B(j) with normal vector (a_j,b_j,c_j) into a point
   P_B(j) = (a_j,b_j,c_j).

2. construct the spherical Voronoi diagram G_B for the set P_B.

3. transform each face F_A(i) with normal vector (a_i,b_i,c_i) into two
   points P_A^+(i) = (a_i,b_i,c_i) and P_A^-(i) = (-a_i,-b_i,-c_i).

4. for each i, determine the nearest neighbors P_B(i") and P_B(i') of the
   points P_A^+(i) and P_A^-(i) respectively by point location in G_B.

5. for each i, if both F_B(i") and F_B(i'), the associated faces of
   F_A(i), are in H(A,i) (i.e., F_A(i) belongs to CH(A,B)) set t_A(i) to 0;
   otherwise (F_A(i) is internal) set t_A(i) to 1.
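
The criterion of Section 6.2.1 can be checked by brute force, which is useful as a reference when testing an implementation of INTERNALA. The sketch below is not the Voronoi-based procedure: it scans every vertex of B for each face of A. The function name, the plane representation (a point on the face plus its outward normal) and the tolerance `eps` are assumptions made for the illustration.

```python
def is_internal(face_point, outward_normal, vertices_b, eps=1e-12):
    """True if face F_A(i) is internal to CH(A, B).

    The face is kept on the hull exactly when all of B lies in H(A, i),
    the closed half-space on the inner side of the face plane. It is
    internal as soon as one vertex of B lies strictly outside that plane.
    """
    px, py, pz = face_point
    a, b, c = outward_normal
    return any(a * (x - px) + b * (y - py) + c * (z - pz) > eps
               for x, y, z in vertices_b)
```

With A the unit cube and B a copy shifted three units along x, only the face of A that looks toward B is reported as internal.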

The transformations in steps 1 and 3 of procedure INTERNALA can be
done in constant time with |F_B| and |F_A| processors respectively. As
discussed in Section 6.2.1, the construction of the spherical Voronoi
diagram for P_B is just a simple transformation from B, which can be done
in constant time. In Section 4.1, we have given a point location algorithm
which runs in time O((log n)^2 log log n) on a SMM with max(n,m) processors,
where n is the number of vertices in the graph and m is the number of points
to be located. Therefore, all the nearest neighbors in step 4 can be determined
in time O((log|F_B|)^2 log log|F_B|) with max(|F_A|,|F_B|) processors. Finally,
step 5 runs in constant time. Thus, the internal faces of A are
determined in time O((log|F_B|)^2 log log|F_B|) on a SMM with max(|F_A|,|F_B|)
processors. Similarly, we can have a procedure INTERNALB(A,B,t_B) which
sets t_B(j) to 1 if face F_B(j) is internal, and to 0 otherwise.

Knowing the internal faces, the circuits C_A and C_B, as defined in
Section 6.2.2, can be determined in time O(log|E_A| log log|C_A|) and
O(log|E_B| log log|E_B|) respectively as follows.

procedure CIRCUITS(A,B)

/* determine the two circuits C_A and C_B for A and B */
begin
    /* C_A contains edges of A, each of which is shared by an internal
       face and an external face */
    C_A ← E_A
    foreach i, 0 ≤ i < |E_A| do
        if E_A(i)[F_1] is internal and E_A(i)[F_2] is external (or vice versa)
            then t(i) ← 1
            else t(i) ← 0
    call EXTRACT1(C_A,t)
    order the edges in C_A as defined in Section 6.2.2

    /* C_B contains edges of B, each of which is shared by an internal
       face and an external face */
    C_B ← E_B
    foreach i, 0 ≤ i < |E_B| do
        if E_B(i)[F_1] is internal and E_B(i)[F_2] is external (or vice versa)
            then t(i) ← 1
            else t(i) ← 0
    call EXTRACT1(C_B,t)
    order the edges in C_B as defined in Section 6.2.2
end
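
The tagging and extraction part of CIRCUITS can be sketched sequentially as follows. This is an illustration only: the parallel compaction (EXTRACT1) is replaced by a list comprehension, the ordering of the circuit is omitted, and the edge representation (a pair of adjacent face indices) is an assumption.

```python
def circuit_edges(edges, internal):
    """Indices of edges shared by one internal and one external face.

    edges:    list of (f1, f2) pairs, the two faces adjacent to each edge
    internal: list of 0/1 tags, internal[f] == 1 if face f is internal
    """
    return [i for i, (f1, f2) in enumerate(edges)
            if internal[f1] != internal[f2]]
```

For faces tagged `[1, 1, 0, 0]` and edges `[(0, 1), (1, 2), (2, 3), (0, 3)]`, only the second and fourth edges separate an internal face from an external one.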
The face determined by the edge C_A(i) and the vertex C_B(j^(i))[V_1] is a
new face of the convex hull. Since j^(i) is the smallest index such that
θ_A(i,j^(i)) is a maximum among all θ_A(i,j), 0 ≤ j < |C_B|, and using the
result in Section 2.1.2, j^(i), for a particular i, can be determined in
time O(log|C_B|) on a SMM. Since (j^(0), j^(1), ...) is a nondecreasing
sequence, we can first find j^(|C_A|/2); then find, in parallel, j^(|C_A|/4)
and j^(3|C_A|/4) in the intervals [0, j^(|C_A|/2)] and [j^(|C_A|/2), |C_B|-1]
respectively, and so on. It is straightforward to see that it takes log|C_A|
iterations to obtain all j^(i)'s. We can obtain all j^(i)'s by invoking the
following procedure with a single call FIND_j^(i)1(0, |C_A|-1, 0, |C_B|-1):
procedure FIND_j^(i)1(a,b,c,d)

/* determine j^(i) in the range [c,d] for each i in [a,b] */

begin if b-a < 0 then return

    /* determine j^(i) where i is in the middle of [a,b] */
    i ← ⌊(a + b)/2⌋
    j^(i) ← MINIMUM({j | c ≤ j ≤ d and θ_A(i,j) =
                      MAXIMUM({θ_A(i,k), c ≤ k ≤ d})})

    /* partition the ranges at i and j^(i), and apply the procedure
       recursively to these sub-ranges */
    call FIND_j^(i)1(a, i-1, c, j^(i))
    call FIND_j^(i)1(i+1, b, j^(i), d)
end
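
A sequential sketch of the same divide-and-conquer search is given below. It assumes, as the text observes, that the sequence of smallest maximizing indices is nondecreasing in i; `theta` is a hypothetical callable standing in for θ_A(i,j), and the two recursive calls, which the parallel version runs simultaneously, are executed one after the other.

```python
def find_all_j(theta, n_a, n_b):
    """Return the list j_of with j_of[i] = smallest j maximizing theta(i, j).

    Correct only when the sequence j_of is nondecreasing in i, which lets
    each half of the rows search a restricted range of columns.
    """
    j_of = [None] * n_a

    def solve(a, b, c, d):
        if a > b:                      # empty range of rows
            return
        i = (a + b) // 2               # middle row
        best = c
        for j in range(c, d + 1):      # strict '>' keeps the smallest index
            if theta(i, j) > theta(i, best):
                best = j
        j_of[i] = best
        solve(a, i - 1, c, best)       # rows before i use columns [c, best]
        solve(i + 1, b, best, d)       # rows after i use columns [best, d]

    solve(0, n_a - 1, 0, n_b - 1)
    return j_of
```

With the test function `theta(i, j) = -(j - 2*i)**2`, the maximizer of row i is j = 2i, which is nondecreasing as required.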

Similarly, we can have an O(log|C_B| log|C_A|) time procedure
FIND_i^(j)1 to produce all i^(j)'s. We are now about to present the entire
merge procedure, which runs in time O((log N)^2 log log N) with N processors.


procedure MERGING1(A,B)

/* merge A and B, store the resulting convex hull in C */

    F_C ← F_A ∪ F_B;  E_C ← E_A ∪ E_B

    /* determine internal faces */
    call INTERNALA(A,B,t_A)
    call INTERNALB(A,B,t_B)

    /* determine the new faces formed by C_A(i) and C_B(j^(i))[V_1]
       or C_B(j) and C_A(i^(j))[V_1] */
    call FIND_j^(i)1(0, |C_A|-1, 0, |C_B|-1)
    call FIND_i^(j)1(0, |C_A|-1, 0, |C_B|-1)

    /* remove all internal faces and edges bounding two internal
       faces */
    remove, from F_C, faces with t_A or t_B = 1
    remove, from E_C, edges E_A(i) such that both
        E_A(i)[F_1] and E_A(i)[F_2] have tag t_A = 1,
        and edges E_B(i) such that both E_B(i)[F_1] and
        E_B(i)[F_2] have t_B = 1.

    /* add new faces and edges */
    add, to F_C, faces determined by C_A(i) and C_B(j^(i))[V_1]
        and faces determined by C_B(j) and C_A(i^(j))[V_1]
    add, to E_C, edges (C_A(i)[V_1], C_B(j^(i))[V_1]),
        (C_A(i)[V_2], C_B(j^(i))[V_1]), (C_B(j)[V_1], C_A(i^(j))[V_1]),
        and (C_B(j)[V_2], C_A(i^(j))[V_1]).


6.3.2  Three-Dimensional  Convex  Hulls  Algorithm 


As  a  preliminary  step,  we  sort  the  set  S  of  N  points  by  their  y 
coordinates  in  ascending  order.  This  can  be  done  in  time  O(logNloglogN) 
with  N  processors.  We  now  present  the  recursive  program  for  determining 


the  three-dimensional  convex  hull  of  S . 


function CH3(S)

/* return CH(S) where S is a set of N points in three dimensions */
begin if N ≤ 2 then return (S)
      else return (MERGING1(CH3(S(0: N/2-1)), CH3(S(N/2: N-1))))
end

The running time T(N) of function CH3 can be obtained from the
recurrence relation T(N) ≤ T(N/2) + M(N), where M(N) is the running time
of function MERGING1. In the previous section, we have shown that M(N) is
O((log N)^2 log log N) with N processors; hence, T(N) = O((log N)^3 log log N).
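
The step from the recurrence to the bound can be written out as follows. The term T(N/2) appears only once because the two halves are processed in parallel; the constant c and the assumption that M is nondecreasing for large N are introduced here for the derivation.

```latex
\begin{align*}
T(N) &\le T(N/2) + M(N), \qquad M(N) \le c\,(\log N)^2 \log\log N,\\
T(N) &\le T(1) + \sum_{k=0}^{\log N - 1} M\!\left(N/2^{k}\right)
      \;\le\; T(1) + \log N \cdot M(N)\\
     &= O\!\left((\log N)^3 \log\log N\right).
\end{align*}
```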

Theorem 6.1. The convex hull of a set of N points in three-dimensional
space can be determined in time O((log N)^3 log log N) on a SMM with N
processors and N memory units.

6.4 On the CCC with N Processors

The  main  purpose  of  this  section  is  to  discuss  the  implementation  of 
the merge algorithm on a CCC. We shall first develop a parallel algorithm
for  finding  the  maxima  of  several  sets  of  numbers.  This  will  be  used  in 
the  implementation. 

6.4.1  Finding  Maxima  of  Multiple  Sets 

Given an array D(0: n-1) of numbers, which is partitioned into m
subarrays D_0, D_1, ..., D_{m-1}, such that the concatenation
D_0 · D_1 · ... · D_{m-1} = D(0: n-1), we want to find the maximum of each D_i.
We assume n is a power of 2. We logarithmically partition each D_i into at
most 2 log n - 1 segments by means of a segment tree T(0,n) [28], which consists
of a root V representing an integer interval [0,n], and of a left subtree
T(0, ⌊n/2⌋) and a right subtree T(⌊n/2⌋+1, n) (refer to Section 4.1 for more
details). For example, D_i = {D(7), D(8), ..., D(13)}, a subarray of D(0: 31),
is partitioned into {{D(7)}, {D(8), ..., D(11)}, {D(12), D(13)}}. We first find
the maximum of each of these segments (to be referred to as submaxima). We
then find the maximum M(i) among the submaxima of the same array D_i.
We now outline the procedure that determines the maxima M(j) of
D_j for 0 ≤ j < m (we shall present the program in the appendix).

1. Logarithmically segment each subarray by means of a segment tree T(0,n).

2. Determine the maxima of the segments by an ASCEND program: at
   iteration k, k = 0, ..., log n - 1, if D(i) and D(i + (1 - 2 BIT_k(i)) 2^k)
   belong to the same segment, change D(i) to the larger of the two; at the
   end of log n iterations, every position of a segment contains the maximum
   of that segment.

3. Extract the submaxima obtained in step 2.

4. Determine the maxima of the sets of submaxima of the same subarray.

As discussed in the planar point location algorithm, the intervals can
be logarithmically segmented in time O(log n) on a CCC with n processors.
Step 2 is an ASCEND program which runs in log n steps. Data extraction,
discussed in Section 2.2.1, runs in time O(log n) on a CCC with n processors.
Since each subarray is segmented into at most 2 log n - 1 segments, there are
at most 2 log n - 1 submaxima in each subarray. Therefore, the maxima of the
submaxima of the same subarray can be determined in time O(log n).

Theorem 6.2. The maxima of each of the subarrays D_0, D_1, ..., D_{m-1} of D,
where the concatenation D_0 · D_1 · ... · D_{m-1} is the array D of n elements,
can be found in time O(log n) on a CCC with n processors.
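
The four steps of the outline can be simulated sequentially. The sketch below is an illustration under stated assumptions, not the CCC program of the appendix: segments are taken to be maximal aligned power-of-two blocks (one common form of segment-tree decomposition), each ASCEND iteration is simulated by a synchronous list update in which position i is paired with position i XOR 2^k, and the subarrays are assumed to partition the whole array. All function names are hypothetical.

```python
def canonical_blocks(lo, hi):
    """Split [lo, hi] (inclusive) into maximal aligned power-of-two blocks."""
    blocks = []
    while lo <= hi:
        # Largest power of two dividing lo (any power of two fits lo == 0).
        size = lo & -lo if lo else 1 << (hi + 1).bit_length()
        while size > hi - lo + 1:      # shrink until the block fits
            size >>= 1
        blocks.append((lo, lo + size - 1))
        lo += size
    return blocks

def segmented_maxima(D, ranges):
    """Maximum of each subarray D[lo..hi]; len(D) must be a power of two."""
    n = len(D)
    seg = [None] * n                   # segment (block) owning each position
    blocks_of = []
    for lo, hi in ranges:              # step 1: logarithmic segmentation
        blocks = canonical_blocks(lo, hi)
        blocks_of.append(blocks)
        for b in blocks:
            for i in range(b[0], b[1] + 1):
                seg[i] = b
    vals = list(D)
    k = 1
    while k < n:                       # step 2: ASCEND over bits 0..log n - 1
        vals = [max(vals[i], vals[i ^ k]) if seg[i] == seg[i ^ k] else vals[i]
                for i in range(n)]
        k <<= 1
    # steps 3 and 4: pick one submaximum per block, then combine per subarray
    return [max(vals[b[0]] for b in blocks) for blocks in blocks_of]
```

For the example in the text, `canonical_blocks(7, 13)` yields the segments {D(7)}, {D(8), ..., D(11)} and {D(12), D(13)}.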
6.4.2 Implementing the Merge Algorithm

We now discuss how the merge algorithm can be implemented on a CCC
with N processors in time O((log N)^3). The procedures INTERNALA and
INTERNALB in Section 6.3.1 for determining the internal faces of polyhedra
A and B can be implemented on a CCC with N processors. The most
time-consuming step is determining all nearest neighbors, which involves the
point location algorithm in Section 4.2. With the result in Section 4.2,
the internal faces can be determined in time O((log N)^3).

We have to modify slightly the procedure CIRCUITS in Section 6.3.1,
for determining the two circuits C_A and C_B, so that it can be implemented
on a CCC. We have to use procedure EXTRACT2 in Section 2.2.1 for data
extraction, and the ordering takes O((log N)^2) time on a CCC. Therefore,
the circuits are determined in time O((log N)^2) on a CCC with N processors.

In implementing the procedure for finding the j^(i) and i^(j) for the
circuits, we have to use the algorithm in the previous section for finding
the maxima of multiple sets on a CCC. Therefore, j^(i) and i^(j) can be
determined in time O((log N)^2) on a CCC with N processors.

The steps in the procedure MERGING1 (Section 6.3.1) can be modified
according to the above discussion and be implemented on a CCC with N
processors in time O((log N)^3). Using the same recursive program CH3
in Section 6.3.2 with this modified merge procedure, we have an O((log N)^4)
time algorithm for determining the three-dimensional convex hull.




Theorem 6.3. The convex hull of a set of N points in three-dimensional
space can be determined in time O((log N)^4) on a CCC with N processors.

6.5 On the CCC with N^(1+α) Processors

In the process of merging two convex hulls, the point location used
in determining all nearest neighbors is the most time-consuming step.
It can be done in time O((1/α)(log N)^2) on a CCC with N^(1+α) processors (refer
to Section 4.3), where 0 < α ≤ 1. Therefore, we have an O((1/α)(log N)^2) time
merging algorithm which yields an O((1/α)(log N)^3) time algorithm for finding
the three-dimensional convex hull.

Theorem 6.4. The convex hull of a set of N points in three-dimensional
space can be determined in time O((1/α)(log N)^3) on a CCC with N^(1+α)
processors, where 0 < α ≤ 1.
CHAPTER 7

VORONOI DIAGRAMS FOR POINTS IN THE EUCLIDEAN PLANE

A Voronoi diagram of a set S(0: N-1) of N points in the Euclidean
plane is a partition of the plane into N convex polygonal regions R(0: N-1)
(refer to Figure 36). For each point S(i), the convex polygonal region
R(i) is the locus of points closer to S(i) than to the other N-1 points of S.
The vertices of the diagram are called Voronoi points, and the line segments
are Voronoi edges. The polygonal boundaries of the regions are called
Voronoi polygons.

The  problem  of  the  construction  of  planar  Voronoi  diagrams  arises  in  many 
areas;  one  of  the  most  important  applications  is  in  nearest  neighbor  problems. 
Shamos and Hoey [35] present an O(N log N) "divide and conquer" algorithm for
construction  of  a  planar  Voronoi  diagram.  Brown  [8]  describes  an  O(NlogN) 
time  algorithm  which  can  be  extended  to  higher  dimensions.  His  result  is 
that  a  two-dimensional  Voronoi  diagram  of  N  points  can  be  constructed  by 
transforming  the  points  to  three-dimensional  space,  constructing  the 
convex  hull  of  the  transformed  points,  and  then  transforming  back  to 
two-dimensional  space. 

In  this  chapter  we  use  Brown's  technique  to  develop  parallel 
algorithms  for  constructing  planar  Voronoi  diagrams  on  the  SMM  and  on  the 
CCC. 

7.1  Definitions  and  Preliminaries 

In  this  section  we  describe  how  to  represent  a  Voronoi  diagram, 
review  some  important  properties  of  the  Voronoi  diagram,  and  define  the 
inversion  transform  which  will  be  used  in  the  construction  of  the  Voronoi 

diagram. 
7.1.1 Representation of Voronoi Diagrams

Let V(0: |V|-1) and E(0: |E|-1) be the sets of Voronoi points and
of Voronoi edges, respectively, of the Voronoi diagram of S(0: N-1),
where |V| ≤ 2N-4 and |E| ≤ 3N-6. Each element V(i) contains the following
information: V(i)[x] and V(i)[y], which are the coordinates of the Voronoi
point V(i), and V(i)[ADJ], the adjacency list of V(i). Element E(i)
contains the two original points that determine Voronoi edge E(i). By
constructing the Voronoi diagram, we also mean obtaining the set of Voronoi
polygons in standard form; P_i(0: |P_i|-1) is the Voronoi polygon relative
to point S(i).

7.1.2 Properties of Voronoi Diagrams

We now review some important properties of Voronoi diagrams which
are exploited in the algorithm of Brown. Each Voronoi point V(i) of the
Voronoi diagram for S is equidistant from the three points of S which
are closest to V(i). The circle determined by these three points is
centered at V(i) and contains no other points of S. Furthermore, if the
circle determined by any three points of S does not contain any other
points of S (these three points are said to satisfy the circumcircle
property), then the center of the circle is a Voronoi point. A Voronoi
edge is the perpendicular bisector of the line segment joining two
points of S which are on the same circumcircle.
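
The circumcircle property can be tested directly, which gives a brute-force way to recognize Voronoi points on small inputs. The sketch below uses the standard closed-form circumcenter of three non-collinear points; the function names and the tolerance `eps` are assumptions made for the illustration.

```python
def circumcenter(p, q, r):
    """Center of the circle through three non-collinear points in the plane."""
    ax, ay = p
    bx, by = q
    cx, cy = r
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return ux, uy

def has_circumcircle_property(p, q, r, others, eps=1e-9):
    """True if no point of `others` lies inside or on the circle through p, q, r."""
    ux, uy = circumcenter(p, q, r)
    r2 = (p[0] - ux) ** 2 + (p[1] - uy) ** 2
    return all((x - ux) ** 2 + (y - uy) ** 2 > r2 + eps for x, y in others)
```

When the property holds, the value returned by `circumcenter` is a Voronoi point of the set.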
7.1.3 The Inversion Transform

The geometric transform used by the algorithm is called inversion.
The inversion is an involutory point-to-point transformation determined by
two parameters, the center of inversion P_0 and the radius of inversion r.
The image of a point Q under the inversion is another point Q', where
P_0 Q and P_0 Q' are in the same direction and the magnitude |P_0 Q'| = r^2/|P_0 Q|.
For example, suppose that the center of inversion is the origin and that the
radius of inversion is one; then under this inversion, in the plane, the image
of a point with polar coordinates (R,θ) is (1/R,θ); and in space, the
image of (R,θ,φ) is (1/R,θ,φ). The inversion transforms any sphere which
passes through the center of inversion to a plane which does not pass
through the center of inversion, and vice versa. For example, with the
center of inversion at a point P_0 not on the xy-plane and radius r > 0,
the xy-plane transforms to a sphere with P_0 at the apex. Another property
of inversion is that the interior of the sphere transforms to a half-space
bounded by the plane which is the image of the sphere, and the
exterior of the sphere transforms to the other half-space.
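
The definition translates into a few lines of code. The sketch below is a minimal illustration; the function name and argument layout are assumptions. It scales the vector from the center to Q by r^2/|P_0 Q|^2, which keeps the direction and gives the required length r^2/|P_0 Q|.

```python
def invert(q, center=(0.0, 0.0, 0.0), r=1.0):
    """Image of point q under inversion with the given center and radius."""
    d = [qi - ci for qi, ci in zip(q, center)]
    d2 = sum(x * x for x in d)
    if d2 == 0:
        raise ValueError("the center of inversion has no image")
    s = r * r / d2                     # |P0 Q'| = r^2 / |P0 Q|
    return tuple(ci + s * x for ci, x in zip(center, d))
```

Applying `invert` twice returns the original point (the transform is involutory). With center (0, 0, 1) and r = 1, points of the xy-plane land on the sphere of radius 1/2 centered at (0, 0, 1/2), which passes through the center of inversion.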

7.2  The  Voronoi  Diagram  Algorithm 

In this section, we shall describe how the techniques of embedding
into three dimensions, inversion, and the three-dimensional convex hull
algorithm are used to construct the Voronoi diagram of a set S of points
in the xy-plane.

Let S' be the set of inversion points of S with center at an arbitrary
point P_0 not in the xy-plane and radius 1. Since all points of the
xy-plane are mapped to a sphere with P_0 at the apex, all points of S' are
on this sphere and they will be on the convex hull of S'. Observe that
any three points of S satisfying the circumcircle property determine a
face F of the convex hull. This happens because the other N-3 points
of S are exterior to the circle determined by these three points, that
is, exterior to the sphere with P_0 at the apex and intersecting the
xy-plane in that circle (refer to Figure 37). Therefore, after the
inversion, the other N-3 points will be in the same half-space bounded
by the plane F. Therefore, we can find the Voronoi points as follows:
we invert each face F_i of the convex hull of S' into the corresponding
sphere, which will intersect the xy-plane in a circle. The center of
this circle is a Voronoi point if P_0 and the convex hull are in the same
half-space whose boundary plane contains face F_i.
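
The observation that a triple with the circumcircle property yields a hull face can be checked numerically on small inputs. The sketch below embeds the points, inverts them about an assumed center P_0 = (0, 0, 1) with radius 1, and tests whether all other images fall strictly on one side of the plane through the three images. The function names are hypothetical, and the brute-force side test stands in for the convex hull construction.

```python
def invert(q, center, r=1.0):
    """Image of q under inversion with the given center and radius."""
    d = [qi - ci for qi, ci in zip(q, center)]
    s = r * r / sum(x * x for x in d)
    return tuple(ci + s * x for ci, x in zip(center, d))

def plane_side(a, b, c, p):
    """Signed volume whose sign tells on which side of plane abc point p lies."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    w = [p[i] - a[i] for i in range(3)]
    n = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    return n[0] * w[0] + n[1] * w[1] + n[2] * w[2]

def is_hull_face(triple, others, p0=(0.0, 0.0, 1.0)):
    """True if the images of `others` lie strictly on one side of the plane
    through the images of the three points in `triple`."""
    imgs = [invert((x, y, 0.0), p0) for x, y in triple]
    sides = [plane_side(imgs[0], imgs[1], imgs[2], invert((x, y, 0.0), p0))
             for x, y in others]
    return all(s > 0 for s in sides) or all(s < 0 for s in sides)
```

A point inside the circumcircle of the triple is mapped to the opposite side of the plane from the points outside it, so the triple then fails the test.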

The Voronoi edges are constructed by connecting appropriate pairs of
Voronoi points. Suppose faces F_i and F_j of the convex hull meet at an
edge of the hull; then there will be a Voronoi edge from V_i to V_j when
both V_i and V_j are Voronoi points. However, if one and only one of V_i
and V_j, say V_i, is a Voronoi point, then there will be an infinite ray
starting at V_i in the direction of V_i V_j (unbounded Voronoi polygon).

We now present the entire Voronoi diagram algorithm as follows:

procedure CONSTRUCT_VD(S)

/* construct the Voronoi diagram of a set S(0: N-1) of points
   in the xy-plane */
begin
    /* embed each point (x,y) of S into (x,y,0) */
1.  foreach i, 0 ≤ i < N do
        begin S*(i)[x] ← S(i)[x]
              S*(i)[y] ← S(i)[y]
              S*(i)[z] ← 0
        end

    /* choose the center and radius of inversion */
    P_0 ← some arbitrary point not on the xy-plane
    r ← 1

    /* invert points in S* w.r.t. P_0 and r */
2.  foreach i, 0 ≤ i < N do S'(i) ← inversion of S*(i) w.r.t. P_0 and r

3.  construct the convex hull CH of S'

    /* determine the Voronoi points */
4.  foreach face F_i of CH do
        begin A_i ← inversion of F_i
              V_i ← center of the circle which is the intersection of
                    A_i and the xy-plane
              if P_0 and CH are in the same half-space bounded by F_i
                  then V_i is a Voronoi point
        end

    /* determine Voronoi edges and rays */
5.  foreach edge E_ij bounding F_i and F_j of CH do
        if V_i is a Voronoi point
            then if V_j is a Voronoi point
                     then V_i V_j is a Voronoi edge
                     else there is a ray starting at V_i
                          in the direction of V_i V_j
            else if V_j is a Voronoi point
                     then there is a ray starting at V_j
                          in the direction of V_j V_i

    obtain the set of Voronoi polygons.
end


We shall show, in the next section, that this algorithm can be
implemented on a SMM and a CCC.
7.3 Implementing the Voronoi Diagram Algorithm on the SMM and the CCC

We first show that the algorithm in Section 7.2 can be implemented on
a SMM with N processors and N memory units in time O((log N)^3 log log N). The
embedding into three dimensions is clearly achievable in constant time
with N processors and N memory units. Each independent inversion transform
can be done in constant time on one processor. Therefore, steps 1, 2 and 5
of the algorithm run in constant time. It is not difficult to show that
step 4 also runs in constant time. The most time-consuming step is step 3
of the algorithm, which requires the construction of the convex hull. We
have shown in Section 6.3 that the three-dimensional convex hull can be
constructed on a SMM with N processors and N memory units in time
O((log N)^3 log log N). The final step, which obtains all the Voronoi polygons,
involves grouping and sorting the edges. This can be done in time
O(log N log log N). Therefore, we have the following result.

Theorem 7.1. The Voronoi diagram of a set of N points in the plane can be
constructed in time O((log N)^3 log log N) on a SMM with N processors and N
memory units.

As we discussed in the previous paragraph, the construction of the
convex hull in three dimensions is the most time-consuming step of the
algorithm. In Sections 6.4 and 6.5, we have presented an O((log N)^4) and
an O((1/α)(log N)^3) three-dimensional convex hull algorithm for the CCC with
N processors and N^(1+α) processors, respectively. It is straightforward
to show that all other steps of the algorithm require at most O((log N)^2)
for N processors and O((1/α) log N) for N^(1+α) processors. Therefore, we have
the following results.
Theorem 7.2. The Voronoi diagram of a set of N points in the plane can
be constructed in time O((log N)^4) on a CCC with N processors.

Theorem 7.3. The Voronoi diagram of a set of N points in the plane can
be constructed in time O((1/α)(log N)^3) on a CCC with N^(1+α) processors,
where 0 < α ≤ 1.
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CHAPTER  8 
CONCLUSION 

It  has  been  demonstrated  in  this  thesis  that  in  solving  certain 
geometric  problems,  operations  can  be  performed  in  parallel  to  sub¬ 
stantially  reduce  the  computation  time.  Using  the  Shared  Memory  Machine 
of  Section  1.1.1,  parallel  algorithms  have  been  developed  to  solve  the 

problems  of  reporting  all  Intersecting  pairs  of  rectangles  in  time 

2  2 
0((logN)  ),  planar  points  location  in  time  0((logN)  loglogN),  constructing 

2 

two-dimensional  convex  hulls  in  time  0((logN)  ),  three-dimensional 

3 

convex  hulls  in  time  0((logN)  loglogN),  and  constructing  planar  Voronoi 

3 

diagram  in  time  0((logN)  loglogN).  Using  the  Cube-Connected -Cycles 

with  a  number  of  processors  linear  in  problem  size,  the  parallel  algorithms  -  - 

developed  for  all  of  these  problems,  except  reporting  intersecting  pairs 

of  rectangles  and  constructing  two-dimensional  convex  hull,  have  time 

complexity  only  increased  by  a  factor  of  logN/loglogN.  The  algorithms 

2 

for  the  two  exceptional  problems  have  time  complexity  0((logN)  )  which 
is  the  same  as  that  on  the  SMM.  With  an  increase  in  the  number  of 

1  +  Ctf  - 

processors  of  the  CCC  to  N  (0  <  a  1),  all  of  the  problems  can  be 
solved  with  parallel  algorithms  of  time  complexity  improved  by  a  factor 
of  l/(alogN)  with  respect  to  the  time  complexity  of  the  algorithms  on  the 
CCC  with  N  processors.  In  contrast,  the  best  sequential  algorithms  for 
all  of  these  problems,  except  planar  point  location,  have  a  worst  case 
time complexity of O(N log N). The best sequential algorithm for
locating M points in a graph of N vertices has time complexity O((M + N) log N).
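
The relationships among the bounds quoted above can be checked numerically. The following Python sketch is illustrative only: it treats each bound as an exact expression with constant factor 1 and logarithms to base 2 (both are assumptions made here, not claims of the thesis), and the names `SMM_BOUNDS`, `SAME_ON_CCC` and `bounds` are hypothetical.

```python
import math

# Growth-rate expressions for the SMM bounds, as functions of L = log2(N).
# Constant factors are taken to be 1 (an assumption for illustration).
SMM_BOUNDS = {
    "rectangle intersection": lambda L: L**2,
    "planar point location": lambda L: L**2 * math.log2(L),
    "2-D convex hull": lambda L: L**2,
    "3-D convex hull": lambda L: L**3 * math.log2(L),
    "Voronoi diagram": lambda L: L**3 * math.log2(L),
}

# The two exceptional problems whose CCC bound equals the SMM bound.
SAME_ON_CCC = {"rectangle intersection", "2-D convex hull"}


def bounds(n, alpha):
    """Return {problem: (SMM, CCC with N, CCC with N^(1+alpha))} for size n."""
    if n <= 2 or not 0 < alpha <= 1:
        raise ValueError("need n > 2 and 0 < alpha <= 1")
    L = math.log2(n)
    table = {}
    for name, smm_bound in SMM_BOUNDS.items():
        smm = smm_bound(L)
        # CCC with N processors: slower by log N / log log N, except for
        # the two exceptional problems.
        ccc = smm if name in SAME_ON_CCC else smm * L / math.log2(L)
        # CCC with N^(1+alpha) processors: improved by 1 / (alpha log N).
        table[name] = (smm, ccc, ccc / (alpha * L))
    return table
```

For N = 2^16 and α = 1/2, the Voronoi diagram entry evaluates to (16384, 65536, 8192), which corresponds to (log N)^3 log log N, (log N)^4 and (1/α)(log N)^3 respectively.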
In parallel computation, it is possible that some processors are not
always busy. It has been shown that the algorithms presented here for
finding the two-dimensional convex hulls and reporting intersecting pairs
of rectangles are not only fast, but involve relatively little waste
as well.

The results in this thesis indicate that geometric problems are
amenable to efficient solution on parallel computer systems.
Moreover, once again, the Cube-Connected-Cycles is shown to be suitable
for implementing algorithms for an expanding class of problems.

We  conclude  this  thesis  by  presenting  the  results  in  Table  1. 
APPENDIX


procedure  CONSTRUCTS  (S) 


/* determine F_{logN}, ..., F_0 for the points in S */
begin
    /* the root F_{logN} is the set S sorted by their x-values */
    F_{logN} ← S
    sort F_{logN} by their x-values
    foreach j, 0 ≤ j < n do N#_{logN}(j) ← 0

    /* determine F_{logN-1}, ..., F_0 one at a time */
    for i ← logN downto 1 do
    begin
        /* determine the node numbers N#_{i-1} in the next level
           i-1 for each point */
        foreach j, 0 ≤ j < n do
        begin F_{i-1}(j) ← F_i(j)
              TEMP(j) ← F_i(j)
              N#_{i-1}(j) ← N#_i(j)
              t1(j) ← t2(j) ← 0
              if y-value of F_i(j) ≤ B_{i-1}(N#_i(j))
              then t1(j) ← 1
              else begin t2(j) ← 1
                         TEMPN#(j) ← N#_i(j) + 2^{logN-i}
                   end
        end

        /* rearrange the points according to their node number */
        call EXTRACT2(F_{i-1}, t1); call EXTRACT2(N#_{i-1}, t1)
        call EXTRACT2(TEMP, t2); call EXTRACT2(TEMPN#, t2)
        foreach j, 0 ≤ j < |TEMP| do
        begin F_{i-1}(j + |F_{i-1}|) ← TEMP(j)
              N#_{i-1}(j + |F_{i-1}|) ← TEMPN#(j)
        end
    end
end
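
The reordering step used in the procedure above can be sketched sequentially in Python. This is a minimal sketch under stated assumptions: EXTRACT2 is defined elsewhere in the thesis, and it is assumed here to compact the flagged elements into consecutive positions while preserving their order. The names `extract` and `split_level`, the `threshold` mapping and the `≤` comparison are hypothetical choices for illustration.

```python
def extract(items, flags):
    """Assumed EXTRACT2 behaviour: keep flagged items, preserving order."""
    return [item for item, flag in zip(items, flags) if flag]


def split_level(points, nodes, threshold, offset):
    """Distribute points of one level to the next level of the structure.

    points    -- list of (x, y) tuples, sorted by node number then x
    nodes     -- node number of each point at the current level
    threshold -- dict: node number -> y-value separating its two children
    offset    -- amount added to the node number of points sent to the
                 upper child (2^(logN-i) in the pseudocode)
    """
    t1 = [y <= threshold[node] for (_, y), node in zip(points, nodes)]
    t2 = [not flag for flag in t1]
    # Points staying in the lower child keep their node number.
    low_points, low_nodes = extract(points, t1), extract(nodes, t1)
    # Points moving to the upper child receive the offset node number.
    high_points = extract(points, t2)
    high_nodes = [node + offset for node in extract(nodes, t2)]
    # Append the second group after the first, as in the final foreach loop.
    return low_points + high_points, low_nodes + high_nodes
```

Because both extractions preserve order, points that were sorted by x-value remain sorted by x-value within each child node.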
procedure INTERSECT3(V,H):

/* search all intersecting pairs of horizontal line segments in H and
   vertical line segments in V */
begin

    /* construct the search structures D_{1/α}, D_{1/α-1}, ..., D_0 for V */

call  CONS  TRUCT^l  (V )  U  0 

    /* H', the set of horizontal line segments, is maintained sorted
       lexicographically by their node numbers and the x-values of
       their left endpoints. */

    H' ← H
    sort H' by x-values of left endpoints
    foreach j, 0 ≤ j < m do NN(j) ← 0
    foreach j, m ≤ j < 2mN^α do H'(j) ← null

    /* search in D beginning at D_{1/α} */
    for i ← 1/α downto 0 do
    begin call RANGE_SEARCH_1D(D_i, H')

          /* determine node numbers of the horizontal line segments
             and reorder H' according to these node numbers */

          for k ← log 2m to log 2mN^α - 1 do  /* duplicate H' N^α times */
              if BIT_k(j) = 0 then begin H'(j + 2^k) ← H'(j)
                                         NN(j + 2^k) ← NN(j)
                                   end

          foreach j, 0 ≤ j < 2mN^α do  /* determine node numbers */
          begin t(j) ← 0
                NN(j) ← NN(j) + ⌊j/2m⌋ N^{1-iα}
                if B_{i-1}(NN(j)) ≤ y-value of H'(j) ≤ T_{i-1}(NN(j))
                then t(j) ← 1
          end

          call EXTRACT2(H', t); call EXTRACT2(NN, t)  /* reordering */
    end

end
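
The duplication loop over k in the procedure above fills an array with copies of its first block by repeated doubling: in round k, every position j whose k-th bit is 0 copies its contents to position j + 2^k. The following Python sketch emulates that loop sequentially. It assumes that both the block length and the number of copies are powers of two; the function name `duplicate_by_doubling` is hypothetical.

```python
def duplicate_by_doubling(block, copies):
    """Fill an array with `copies` copies of `block` in log2(copies) rounds."""
    size = len(block)
    if size == 0 or size & (size - 1):
        raise ValueError("block length must be a power of two")
    if copies <= 0 or copies & (copies - 1):
        raise ValueError("number of copies must be a power of two")
    total = size * copies
    out = list(block) + [None] * (total - size)
    first_round = size.bit_length() - 1   # log2(size)
    last_round = total.bit_length() - 1   # log2(total)
    for k in range(first_round, last_round):
        step = 1 << k
        # On the parallel machine all positions j act simultaneously.
        # Sources have bit k clear and targets have bit k set, so the two
        # index sets are disjoint and a sequential loop gives the same result.
        for j in range(total):
            if (j >> k) & 1 == 0:
                out[j + step] = out[j]
    return out
```

Each round doubles the filled prefix, so N^α copies are produced in α log N rounds rather than N^α sequential copy steps.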
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procedure  CO^iSTRUClXS) 


/* construct the arrays G_{1/α}, ..., G_0 for the set S of points */
begin

    /* the root, G_{1/α}, is the set S sorted by x-coordinates */
    G_{1/α} ← S
    sort G_{1/α} by their x-values
    foreach j, 0 ≤ j < n do N#_{1/α}(j) ← 0

    /* determine G_{1/α-1}, ..., G_0 one by one in descending order */
    for i ← 1/α downto 1 do
    begin

        /* G_{i-1} is obtained by reordering G_i as follows */
        foreach j, 0 ≤ j < nN^α do
        begin G_{i-1}(j) ← G_i(j)
              N#_{i-1}(j) ← N#_i(j)
              t(j) ← 0

              /* duplicate G_i into N^α copies */
              for k ← log n to log nN^α - 1 do
                  if BIT_k(j) = 0 then begin G_{i-1}(j + 2^k) ← G_{i-1}(j)
                                             N#_{i-1}(j + 2^k) ← N#_{i-1}(j)
                                       end

        /* determine node numbers of each point in G_{i-1} */
        foreach j, 0 ≤ j < nN^α do
        begin N#_{i-1}(j) ← N#_{i-1}(j) + ⌊j/n⌋ N^{1-iα}
              if B_{i-1}(N#_{i-1}(j)) ≤ y-value of G_{i-1}(j) ≤ T_{i-1}(N#_{i-1}(j))
              then t(j) ← 1
        end

        /* reorder the points according to their node numbers
           and x-coordinates */
        call EXTRACT2(G_{i-1}, t)
        call EXTRACT2(N#_{i-1}, t)
    end

end
procedure RANGE_SEARCH3(S, Q)

/* report all points a ∈ S such that Q(i)[L] ≤ x(a) ≤ Q(i)[R]
   and Q(i)[B] ≤ y(a) ≤ Q(i)[T] for every Q(i) */

/* construct the search arrays G_{1/α}, ..., G_0 for S */
call  CONS TRUCT.XS)  /0( 

/* Q' is the set Q sorted by Q(i)[L] */

Q' ← Q
sort Q' by x-values of left endpoints
foreach j, 0 ≤ j < m do NN(j) ← 0
foreach j, m ≤ j < 2mN^α do Q'(j) ← null

/* search in G_{1/α}, ..., G_0 one at a time */
for i ← 1/α downto 0 do
begin

    /* determine Q" which is a subset of queries which can
       be answered at this level. The remaining queries
       determine the node numbers in the next level */

    foreach j, 0 ≤ j < 2mN^α do
    begin t1(j) ← t2(j) ← 0
          Q"(j) ← Q'(j)
          NN"(j) ← NN(j)
          if Q'(j) ≠ null
          then if Q'(j)[B] ≤ B_i(NN(j)) and
                  T_i(NN(j)) ≤ Q'(j)[T]
               then t1(j) ← 1
               else t2(j) ← 1
    end

    call EXTRACT2(Q", t1); call EXTRACT2(NN", t1)

    /* answer queries in Q" by performing a one-dimensional
       range searching on G_i */
    call RANGE_SEARCH_1D(G_i, Q")

    /* extract Q'-Q" from Q' and reorder the queries
       according to their node number */
    call EXTRACT2(Q', t2); call EXTRACT2(NN, t2)

    for k ← log 2m to log 2mN^α - 1 do
        foreach j, 0 ≤ j < 2mN^α do
            if BIT_k(j) = 0 then
            begin Q'(j + 2^k) ← Q'(j)
                  NN(j + 2^k) ← NN(j)
            end

    end

    foreach j, 0 ≤ j < 2mN^α do
    begin t(j) ← 0
          NN(j) ← NN(j) + ⌊j/2m⌋ N^{1-iα}
          if (Q'(j)[B] < T_{i-1}(NN(j)) or
              Q'(j)[T] > B_{i-1}(NN(j)))
             and Q'(j) ≠ null
          then t(j) ← 1
    end

    call EXTRACT2(Q', t); call EXTRACT2(NN, t)

end
procedure CONSTRUCTS (E)

/* construct the point location tree for the set E(0:|E|-1) of edges */
begin

    /* C_{i,j} is a subset of edges which may belong to NODE_i(j) */
    foreach k, 0 ≤ k < |E| do C_{logN,0}(k) ← E(k)

    /* determine the nodes of T, level by level */
    for i ← logN downto 0 do

        /* extract the appropriate edges from C_{i,j} to form NODE_i(j);
           then form C_{i-1,2j} and C_{i-1,2j+1} from the remaining edges */

        foreach j, 0 ≤ j < 2^{logN-i} do
        begin NODE_i(j) ← ∅
              C_{i-1,2j} ← C_{i-1,2j+1} ← C_{i,j}
              if C_{i,j} ≠ ∅ then
              begin

                  /* extract from C_{i,j} edges that belong
                     to NODE_i(j) */

                  foreach k, 0 ≤ k < |C_{i,j}| do
                      if C_{i,j}(k)[B] ≤ B_i(j) and
                         T_i(j) ≤ C_{i,j}(k)[T]
                      then t(k) ← 1
                      else t(k) ← 0
                  call EXTRACT1(C_{i,j}, t)
                  NODE_i(j) ← C_{i,j}
                  sort edges in NODE_i(j) in the
                      positive x direction

                  /* determine C_{i-1,2j} and C_{i-1,2j+1} */

                  foreach k, 0 ≤ k < |C_{i-1,2j}| do
                  begin if t(k) = 0 and
                           C_{i-1,2j}(k)[B] < B_{i-1}(2j)
                        then t1(k) ← 1 else t1(k) ← 0
                        if t(k) = 0 and C_{i-1,2j}(k)[T]
                           > B_{i-1}(2j+1)
                        then t2(k) ← 1 else t2(k) ← 0

                  call EXTRACT1(C_{i-1,2j}, t1)
                  call EXTRACT1(C_{i-1,2j+1}, t2)


procedure LOCATE1(G,P)

/* locate the set of points P(0: M-1) in the planar subdivision
   induced by the graph G = (V,E) */
begin

/* construct the point location tree T for the edges of G */
call  CONSTRUCT-J'Z  (E  ) 

/* J_0(k) and J_1(k) are the indices of the nodes which we have
   to search for point P(k); L(k) and R(k) are edges on the
   left and right, respectively, of P(k) */

foreach k, 0 ≤ k < M do
begin J_0(k) ← 0; J_1(k) ← -1

L(k)  - 

R(k)  -  i 


/* search in T one level at a time */
for i ← logN downto 0 do
for ℓ ← 0 to 1 do
    foreach k, 0 ≤ k < M do
    if J_ℓ(k) ≥ 0 then
    begin TEMPL(k) ← edge in NODE_i(J_ℓ(k)) that is
                     closest to and left of P(k)
          TEMPR(k) ← edge in NODE_i(J_ℓ(k)) that is
                     closest to and right of P(k)
          if TEMPL(k) is right of L(k) then
              L(k) ← TEMPL(k)
          if TEMPR(k) is left of R(k) then
              R(k) ← TEMPR(k)

          if L(k) and R(k) bound the same region
          then begin P(k) is in the region bounded
                     by L(k) and R(k)

*“  00  *”  “i 


J^(k)  -  *"  “1 

end 

else  if  y-value  of  P(k)  ■  T^_^(2j^(k)) 
then  begin  J£01(k)  -  UjL  (k)  +  1 
end 

else  if  y-value  of 

P(k)  <  Tt_1(2Ji(k)) 

then  Jx(k)  -  2j£(k) 

else  J^(k)  -  2j£(k)  +  1 
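
The level-by-level descent in LOCATE1, in which a point at node J moves to child 2J or 2J + 1 according to its y-value, can be illustrated with a short sequential sketch. This is a simplified illustration, not the thesis procedure: it assumes a complete binary tree whose nodes each store one separating y-value, it ignores the edge lists NODE_i(j), and it sends a point lying exactly on a separator to the right child, whereas LOCATE1 tracks two candidate nodes in that case. The names `descend` and `splits` are hypothetical.

```python
def descend(y, splits):
    """Return the node indices visited by a point with ordinate y.

    splits[d][j] is the y-value separating children 2j and 2j + 1 of
    node j at depth d (depth 0 is the root).
    """
    j = 0
    path = [j]
    for level in splits:
        # Below the separator: left child 2j; otherwise right child 2j + 1.
        j = 2 * j if y < level[j] else 2 * j + 1
        path.append(j)
    return path
```

With `splits = [[10], [5, 15]]`, a point with y = 7 visits nodes 0, 0, 1 on successive levels, and a point with y = 12 visits nodes 0, 1, 2.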


procedure  C0NSTRUCT_52(E) : 

/* determine the search structure E_{logN}, ..., E_0 for the set E of edges */

begin

    /* S is the set of edges from which E_i is formed */

    foreach j, 0 ≤ j < |E| do begin S(j) ← E(j); π(j) ← 0; end
    foreach j, |E| ≤ j < 4|E| do S(j) ← null

    /* determine E_{logN}, ..., E_0 one by one */

    for i ← logN downto 0 do
    begin

        /* determine the edges in E_i */

        foreach j, 0 ≤ j < 4|E| do
        begin t1(j) ← t2(j) ← 0
              E_i(j) ← S(j); N#_i(j) ← π(j)
              if S(j) ≠ null then
                  if S(j)[B] ≤ B_i(π(j)) and T_i(π(j)) ≤ S(j)[T]
                  then t1(j) ← 1
                  else t2(j) ← 1
        end

        call EXTRACT2(E_i, t1); call EXTRACT2(N#_i, t1)

        sort both E_i and N#_i lexicographically by values of
        N#_i(j) and positions of E_i(j) in the direction of
        positive x.

        /* determine edges which may belong to the next level
           of E */

        call EXTRACT2(S, t2); call EXTRACT2(π, t2)

        foreach j, 0 ≤ j < 4|E| do
        begin TEMP(j) ← S(j)
              t1(j) ← t2(j) ← 0
              if S(j)[B] < T_{i-1}(π(j)) then t1(j) ← 1
              if S(j)[B] > T_{i-1}(π(j))
              then begin t2(j) ← 1
                         TEMPπ(j) ← 2^{logN-i} + π(j)
                   end
        end

        call EXTRACT2(S, t1); call EXTRACT2(π, t1)
        call EXTRACT2(TEMP, t2); call EXTRACT2(TEMPπ, t2)

        foreach j, 0 ≤ j < |TEMP| do begin S(j + |S|) ← TEMP(j)
                                           π(j + |S|) ← TEMPπ(j)
procedure LOCATE2(G,P):

/* locate the set of points P(0: M-1) in the planar subdivision
   induced by G = (V,E) */
begin

/* construct the search structure E_{logN}, E_{logN-1}, ..., E_0 for the
   set E of edges */

call  CONSTRUCT.#  (E) 

/* P' is the set of points to be located; they are sorted by
   their node numbers and then x-coordinates */
sort P by x-coordinates
foreach k, 0 ≤ k < 2M do
begin NN(k) ← 0; P'(k) ← P(k)
      L(k) ← E_{-∞}
      R(k) ← E_{∞}
end

foreach k, M ≤ k < 2M do P'(k) ← null

/* search in E_{logN}, ..., E_0 one at a time until edges L(k) and
   R(k), for each k, bound the same region */
for i ← logN downto 0 do
begin call SEARCH(E_i, P', TEMPL)    /* parallel searching in
                                        Section 2.2.3 */
      call SEARCH1(E_i, P', TEMPR)   /* modified SEARCH */

      foreach k, 0 ≤ k < 2M do
      begin if TEMPL(k) is right of L(k) then
                L(k) ← TEMPL(k)
            if TEMPR(k) is left of R(k) then
                R(k) ← TEMPR(k)
            if L(k) and R(k) bound the same region
            then begin P'(k) is in the region bounded
                       by L(k) and R(k)
                       P'(k) ← null
                 end

            t1(k) ← t2(k) ← 0
            TEMP(k) ← P'(k)
            TEMPNN(k) ← 2^{logN-i} + NN(k)
            if P'(k) ≠ null then
            begin if y-value of P'(k) ≤ T_{i-1}(NN(k))
                  then t1(k) ← 1
                  if y-value of P'(k) ≥ T_{i-1}(NN(k))
                  then t2(k) ← 1
            end

      call EXTRACT2(P', t1)
      call EXTRACT2(NN, t1)
      call EXTRACT2(TEMP, t2)
      call EXTRACT2(TEMPNN, t2)

      foreach k, 0 ≤ k < |TEMP| do
      begin P'(|P'| + k) ← TEMP(k)
            NN(|P'| + k) ← TEMPNN(k)
      end

end


procedure CONSTRUCT.# (E);
begin

    foreach j, 0 ≤ j < |E| do begin S(j) ← E(j); π(j) ← 0 end
    foreach j, |E| ≤ j < 2|E|N^α do S(j) ← null
    for i ← 1/α downto 0 do
    begin

        foreach j, 0 ≤ j < 2|E|N^α do
        begin t1(j) ← t2(j) ← 0; D_i(j) ← S(j); N#_i(j) ← π(j)
              if S(j) ≠ null
              then if S(j)[B] ≤ B_i(π(j)) ≤ S(j)[T]
                   then t1(j) ← 1
                   else t2(j) ← 1
        end

        call EXTRACT2(D_i, t1); call EXTRACT2(N#_i, t1)

        sort both D_i and N#_i lexicographically by values of
        N#_i(j) and position of D_i(j) in positive x direction

        call EXTRACT2(S, t2); call EXTRACT2(π, t2)

        for k ← log 2|E| to log 2|E|N^α - 1 do
            foreach j, 0 ≤ j < 2|E|N^α do
                if BIT_k(j) = 0 then begin S(j + 2^k) ← S(j)
                                           π(j + 2^k) ← π(j)
                                     end

        foreach j, 0 ≤ j < 2|E|N^α do
        begin π(j) ← π(j) + ⌊j/2|E|⌋ N^{1-iα}
              t(j) ← 0
              if S(j) ≠ null and (S(j)[B] < T_{i-1}(π(j)) or
                                  S(j)[T] > B_{i-1}(π(j)))
              then t(j) ← 1
        end

        call EXTRACT2(S, t); call EXTRACT2(π, t)

    end

end


procedure LOCATE3(G,P):

/* locate the set of points P(0: M-1) in the planar subdivision
   induced by G */
begin  call  C0NSTRUCT_«S2(E) 

      sort P by x-coordinates
      foreach k, 0 ≤ k < |E| do
      begin P'(k) ← P(k)
            NN(k) ← 0
            L(k) ← E_{-∞}
            R(k) ← E_{∞}
      end

      foreach k, |E| ≤ k < 2|E|N^α do P'(k) ← null
      for i ← 1/α downto 0 do
      begin call SEARCH(D_i, P', TEMPL)    /* parallel searching in
                                              Section 2.2.3 */
            call SEARCH1(D_i, P', TEMPR)   /* modified SEARCH */
            foreach k, 0 ≤ k < 2|E|N^α do
                if P'(k) ≠ null then
                begin if TEMPL(k) is right of L(k) then
                          L(k) ← TEMPL(k)
                      if TEMPR(k) is left of R(k) then
                          R(k) ← TEMPR(k)
                      if L(k) and R(k) bound the same region
                      then begin P'(k) is in the region
                                 bounded by L(k)
                                 and R(k)
                                 P'(k) ← null
                           end
                end

            for j ← log 2|E| to log 2|E|N^α - 1 do
                if BIT_j(k) = 0 then begin P'(k + 2^j) ← P'(k)
                                           NN(k + 2^j) ← NN(k)
                                           L(k + 2^j) ← L(k)
                                           R(k + 2^j) ← R(k)
                                     end

            foreach k, 0 ≤ k < 2|E|N^α do
            begin t(k) ← 0
                  NN(k) ← NN(k) + ⌊k/2|E|⌋ N^{1-iα}
                  if P'(k) ≠ null and
                     B_{i-1}(NN(k)) ≤ y-value of
                     P'(k) ≤ T_{i-1}(NN(k))
                  then t(k) ← 1
            end

            call EXTRACT2(P', t); call EXTRACT2(NN, t)

      end

end
function TANGENTS1(A, B)

/* returns the indices of the extremes of the left tangent and right
   tangent of A, B where A and B are two non-intersecting convex
   polygons and y-coordinates of vertices in B > those in A */

begin

/* determine the ranges in which j* and i* lie */

if x-value of B(r_B) < x-value of A(r_A)
then begin a ← 0; b ← r_A; c ← 0; d ← r_B; end;
else begin a ← r_A; b ← s_A; c ← r_B; d ← s_B; end;

/* determine j(i) at selected values of i */

foreech  1.  i€  ta  +  k,a  + 2k, . . .  ,a  +  (k-l)kl  do 

j(1)  -  MINJ/JBITONIC  <CY1>c»YlfC+1.---»Y1>d}) 

/*  i*  is  in  the  range  [i  -  k+  l,i  +  k  -  1] ,  determine  i*  and  J* 
in  this  range  */ 

I  -  MINIMUMl((i|  j(i)  S  J(h),  h-a  +  k,a  +  2k,  ...,a+  (k-l)k}) 
foreach  i,  i  €  £I-k+l,£-k  +  2,...,i  +  k-l}  do 

j(i)  -  MINJ/J1T0N1C  (tYi>c,Yt>c+1 . Yi>d)) 

j*  -  MINIMUMl(Cj^i)  |i  - I  - k  + 1 . i+k-  l}) 

foreach  i,  i  €  il-k+l,...,i  +  k-l3  do ,,  > 

a  >  vi. j*+i  '*  J*  ■ J  *' 

iai  >  \,j.  lai  «l(1+l  - n.j*  <  n  '*  wrc*  <2>  *' 

then  i*  •-  i 

/*  determine  the  ranges  in  which  j*  and  i*  lie  */ 
if  x~value  of  B(ig)  <  x-value  of  AOl^) 

then  begin  a  •“  s^;  b  •-  c  •“  sg;  d  **  Xg;  end; 

else  begin  a  •-  b  •-  n;  c  *•  J tg;  d  *-  m;  end; 

/* determine j(i) at selected values of i */

k ← √(b-a+1)

foreach  i.  i€  C*  +  k,a  +  2k, . . .  ,a  +  (k-l)k}  do. 

j(i)  -  MAX-AJBI TONIC  (CYi>c,Yi>c+1. • • • »Yidl> 
/* i* is in the range [ī-k+1, ī+k-1], determine j* and i*
   in this range */

i  -  MAXDlUMl(£ilj(i)  *  h  -  a+k,a+2k, . . .  ,a+(k-l)k}) 

for*ach  i,  i€  Ci  -  1 . i  +  k  -  l}  do 

J(l)  -  MAJLJS_BITONIC  (CYl  c,Yt>c+1,...,Yl  d}) 

j*  -  MAXD!UMl(tJ(i)|i  -k+1 . i  +  k-l}) 

for**ch  i,  i  €  t 

•'l.J.-l  S  vl,j«  >  n,3*rt  ^  *1,1+1  «  Yl,j*  *s± 

“1,1-1  -*1,3.* 

r«eura  (j*,  1*.  ]*,  I») 

end 
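
The sampling strategy used in TANGENTS1 (examine roughly √n equally spaced candidates, then search only the window of width about 2k around the best one) can be illustrated on a one-dimensional problem. The following Python sketch is a sequential emulation under stated assumptions: the input is a strictly decreasing-then-increasing sequence, the two endpoints are added to the sample set (a simplification made here), and the two-dimensional search over pairs of polygon vertices is not reproduced. The name `unimodal_argmin` is hypothetical.

```python
import math


def unimodal_argmin(values):
    """Index of the minimum of a strictly decreasing-then-increasing list."""
    n = len(values)
    if n == 0:
        raise ValueError("empty sequence")
    k = max(1, math.isqrt(n))  # sampling stride, about sqrt(n)

    # Phase 1: evaluate about sqrt(n) equally spaced samples. On the
    # parallel machine one processor handles each sample.
    samples = list(range(0, n, k))
    if samples[-1] != n - 1:
        samples.append(n - 1)
    coarse = min(samples, key=lambda i: values[i])

    # Phase 2: the true minimum lies within k - 1 positions of the best
    # sample, so only that window needs to be examined.
    lo = max(0, coarse - k + 1)
    hi = min(n - 1, coarse + k - 1)
    return min(range(lo, hi + 1), key=lambda i: values[i])
```

Both phases examine O(√n) candidates, which is what allows each phase to be assigned to a linear number of processors when the candidates themselves require a search.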


function  RJaNGENT_JNDEX(A,B) : 


1. 


2. 


3. 


/* returns j* */

begin /* determine the appropriate range for j* */
if x-value of B(r_B) < x-value of A(r_A)
then begin a ← 0; b ← r_A; c ← 0; d ← r_B; end
else begin a ← r_A; b ← s_A; c ← r_B; d ← s_B; end

k ← √(b-a+1)


/* determine j = min(j(i)), where γ_{i,j(i)} = min{γ_{i,c+h}, γ_{i,c+2h}, ...,
   γ_{i,c+(h-1)h}} for i = a+k, a+2k, ..., a+(k-1)k */

4. duplicate {A(a+k), A(a+2k), ..., A(a+(k-1)k)} into pattern P2(h-1)

5. let the resulting array be C(0: (h-1)(k-1)-1);

6. duplicate {B(c+h), B(c+2h), ..., B(c+(h-1)h)} into pattern P1(k-1)

   let the resulting array be D(0: (h-1)(k-1)-1);

7. foreach i, 0 ≤ i < (h-1)(k-1) do GAMMA(i) ← θ(C(i), D(i))

8. foreach i, 0 ≤ i < (h-1)(k-1) do
   begin J(i) ← ∞
         case i mod (h-1) of
         0:    if GAMMA(i) < GAMMA(i+1) then
                   J(i) ← c + ((i mod (h-1)) + 1)h
         h-2:  if GAMMA(i-1) > GAMMA(i) then
                   J(i) ← c + ((i mod (h-1)) + 1)h
         else: if GAMMA(i-1) > GAMMA(i) < GAMMA(i+1)
                   then J(i) ← c + ((i mod (h-1)) + 1)h
   end
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/*  determine  1  €  [a+k,a+2k, . . .  ,a+(k-l)k]  such  Chet  Yr  7  i*  Che  { 

t  «  **J  1. 

smallest  among  Yi  ^  f°r  1  €  (a+k, . . . ,a+(k-l)kj  X  € 

(X-h+l,X-h+2, . . .,X+h-l}  and  for  soma  j  €  (X -h+l,X-h+2 . A+h-l}  */ j 

j ← min{J(0: (h-1)(k-1)-1)}

duplicate {B(j-h+1), B(j-h+2), ..., B(j)} into pattern P1(k-1)
let the resulting array be D(0: (h-1)(k-1)-1);
foreach i, 0 ≤ i < (h-1)(k-1) do GAMMA(i) ← θ(C(i), D(i))
foreach i, 0 ≤ i < (h-1)(k-1) do
begin J'(i) ← ∞
      case i mod (h-1) of
      0:    if GAMMA(i) < GAMMA(i+1) then
                J'(i) ← j - h + 1 + (i mod (h-1))
      h-2:  if GAMMA(i-1) > GAMMA(i) then
                J'(i) ← j - h + 1 + (i mod (h-1))
      else: if GAMMA(i-1) > GAMMA(i) < GAMMA(i+1)
                then J'(i) ← j - h + 1 + (i mod (h-1))
end

j' ← min{J'(0: (h-1)(k-1)-1)}
i' ← min{i | J'(i) = j'}

duplicate {B(j), B(j+1), ..., B(j+h-1)} into pattern P1(k-1)
let the resulting array be D(0: (h-1)(k-1)-1)
foreach i, 0 ≤ i < (h-1)(k-1) do GAMMA(i) ← θ(C(i), D(i))
foreach i, 0 ≤ i < (h-1)(k-1) do
begin J'(i) ← ∞
      case i mod (h-1) of
      0:    if GAMMA(i) < GAMMA(i+1) then
                J'(i) ← j + (i mod (h-1))
      h-2:  if GAMMA(i-1) > GAMMA(i) then
                J'(i) ← j + (i mod (h-1))
      else: if GAMMA(i-1) > GAMMA(i) < GAMMA(i+1)
                then J'(i) ← j + (i mod (h-1))
end

j" ← min{J'(0: (h-1)(k-1)-1)}
i" ← min{i | J'(i) = j"}

if  j'  -  j"  then  i  -  a+  (lMa(i '  ,i")/h-lj  +  l)k 

else  if  j'  <  j”  then  I  -  a+  (Li'/h-lJ  +  l)k 
•l*e  i  •  +  (U'Vh-lJ  +  l)k 

/*  1*  "  for  some  i  6  (I-k+l.i-k+2, . . .  ,£+k-l}  */ 

duplicate {A(ī-k+1), A(ī-k+2), ..., A(ī)} into pattern P2(h-1)

let the resulting array be C(0: (h-1)(k-1)-1)
repeat steps 6-20

j* ← min(j', j")

duplicate {A(ī), A(ī+1), ..., A(ī+k-1)} into pattern P2(h-1)

let the resulting array be C(0: (h-1)(k-1)-1)

repeat steps 6-20

j* ← min(j*, j', j")

return (j*)
procedure MULTI_MAX (D, π, FIRST, LAST)

/* D(0: n-1) is an array of numbers. π(i) is the index of the subarray
   to which D(i) belongs, that is D(i) ∈ D_{π(i)}. Partition D into
   subsets such that elements in each subset have the same π-values; find
16  t  lDj.il  +  lDj|l]  «nd  lD.il-0,  FIRST(i)  and  LAST(i)  are  the 

indices of the first and the last elements of the subset D_{π(i)}. */

begin

/* logarithmically partition each subset: first determine the
   first element of each partition */
foreach i, 0 ≤ i < n do
    if FIRST(i) = i then t(i) ← 1
    else begin L ← 0
               R ← n-1
               t(i) ← 0
               while FIRST(i) > L or LAST(i) < R do
                   if i = ⌊(L + R)/2⌋ + 1
                   then begin t(i) ← 1
                              L ← FIRST(i)
                              R ← LAST(i)
                        end
                   else if i ≤ ⌊(L + R)/2⌋
                        then R ← ⌊(L + R)/2⌋
                        else L ← ⌊(L + R)/2⌋ + 1
         end

/* classify each partition */

2. call RANK (D, t, CLASS)

3. foreach i, 0 ≤ i < n do
       if t(i) = 1 then CLASS(i) ← CLASS(i) - 1

/* determine the submaximum in each partition, i.e., maximum
   of the elements in the same class */

4. foreach i, 0 ≤ i < n do begin SM(i) ← D(i); π'(i) ← π(i) end

5. for j ← 0 to log n - 1 do
       foreach i, 0 ≤ i < n do
           if CLASS(i) = CLASS(i + (1 - 2 BIT_j(i)) 2^j)
           then SM(i) ← max(SM(i), SM(i + (1 - 2 BIT_j(i)) 2^j))

/* concentrate the submaximums into consecutive processors */

6. call CONCENTRATE (SM, CLASS, t)

7. call CONCENTRATE (π', CLASS, t)


/* determine sequentially the maximum of the (at most 2 log n - 1)
   submaximums of the same subset */
for  1  1  to  2  logn-1  do 

begin  j  *■  j  +  1 

foreach  i,  0  5  i  <  n-1  do 

if  rr'(i)  -  TT*(i  +  l)  . 

then  SM(i)  -  max(SM(i)  ,SM(i  +  (l-2BIT(i))2J)) 

end 

/* concentrate the maximums into consecutive processors */
foreach i, 1 ≤ i < n do
    if π'(i-1) < π'(i) then t(i) ← 1
    else t(i) ← 0

t(0) ← 1

call CONCENTRATE (SM, π', t)


